THERAPEUTIC COMPOSITIONS

RELATED APPLICATIONS
The present application claims the benefit of Application No. 60/452,682, filed
March 7, 2003; Application No. 60/462,894, filed April 14, 2003; Application
No. 60/465,665, filed April 25, 2003; Application No. 60/463,772, filed April 17, 2003;
Application No. 60/465,802, filed April 25, 2003; Application No. 60/493,986, filed
August 8, 2003; Application No. 60/494,597, filed August 11, 2003; Application No.
60/506,341, filed September 26, 2003; Application No. 60/518,453, filed November 7, 2003;
Application No. 60/454,265, filed March 12, 2003; Application No. 60/454,962, filed March
13, 2003; Application No. 60/455,050, filed March 13, 2003; Application No. 60/469,612,
filed May 9, 2003; Application No. 60/510,246, filed October 9, 2003; and Application
No. 60/510,318, filed October 10, 2003. The contents of these provisional applications are
hereby incorporated by reference in their entirety.

TECHNICAL FIELD
The invention relates to RNAi and related methods, e.g., methods of making and 
using iRNA agents. 

BACKGROUND 

RNA interference or "RNAi" is a term initially coined by Fire and co-workers to
describe the observation that double-stranded RNA (dsRNA) can block gene expression
when it is introduced into worms (Fire et al. (1998) Nature 391, 806-811). Short dsRNA
directs gene-specific, post-transcriptional silencing in many organisms, including vertebrates,
and has provided a new tool for studying gene function. RNAi may involve mRNA
degradation.

SUMMARY

A number of advances related to the application of RNAi to the treatment of subjects
are disclosed herein. For example, the invention features iRNA agents targeted to specific
genes; palindromic iRNA agents; iRNA agents having non-canonical monomer pairings;
iRNA agents having particular structures or architectures, e.g., the Z-X-Y or asymmetrical
iRNA agents described herein; drug delivery conjugates for the delivery of iRNA agents;
amphipathic substances for the delivery of iRNA agents; as well as iRNA agents having
chemical modifications for optimizing a property of the iRNA agent. The invention features
each of these advances broadly as well as in combinations. For example, an iRNA agent
targeted to a specific gene can also include one or more of a palindrome, non-canonical,
Z-X-Y, or asymmetric structure. Other nonlimiting examples of combinations include an
asymmetric structure combined with a chemical modification, or formulations or methods or
routes of delivery combined with, e.g., chemical modifications or architectures described
herein. The iRNA agents of the invention can include any one of these advances, or pairwise
or higher order combinations of the separate advances.

In one aspect, the invention features iRNA agents that can target more than one RNA 
region, and methods of using and making the iRNA agents. 

In another aspect, an iRNA agent includes a first and second sequence that are
sufficiently complementary to each other to hybridize. The first sequence can be
complementary to a first target RNA region and the second sequence can be complementary
to a second target RNA region.

In one embodiment, the first and second sequences of the iRNA agent are on different
RNA strands, and the mismatch between the first and second sequences is less than 50%,
40%, 30%, 20%, 10%, 5%, or 1%.
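
As a rough illustration of the mismatch thresholds above, the following Python sketch computes the percentage of positions that fail to pair when two equal-length strands are aligned antiparallel. The function name `mismatch_percent`, the restriction to Watson-Crick pairs (G-U wobble is counted as a mismatch), and the gap-free, equal-length alignment are assumptions made for illustration; the specification does not state how the mismatch percentage is to be calculated.

```python
# Watson-Crick pairs only. Treating G-U wobble pairs as mismatches is an
# assumption of this sketch, not something the text specifies.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def mismatch_percent(first: str, second: str) -> float:
    """Percent of positions that do not pair when two strands align antiparallel.

    Both sequences are written 5'->3'. Assumes equal length, no gaps, no overhangs.
    """
    first, second = first.upper(), second.upper()
    if not first or len(first) != len(second):
        raise ValueError("this sketch expects two non-empty sequences of equal length")
    # Position i of `first` faces position (n - 1 - i) of `second` in the duplex.
    mismatches = sum(
        (a, b) not in WATSON_CRICK for a, b in zip(first, reversed(second))
    )
    return 100.0 * mismatches / len(first)


# Arbitrary example sequences, not taken from the specification.
perfect = mismatch_percent("AUGGCUAGCU", "AGCUAGCCAU")  # fully complementary pair
one_off = mismatch_percent("AUGGCUAGCU", "AGCUAGCCAA")  # one non-pairing position
```

Under these assumptions, a candidate pair would satisfy, say, the "less than 20%" embodiment when `mismatch_percent(first, second) < 20`.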

In another embodiment, the first and second sequences of the iRNA agent are on the
same RNA strand, and in a related embodiment more than 50%, 60%, 70%, 80%, 90%, 95%,
or 1% of the iRNA agent is in bimolecular form.

In another embodiment, the first and second sequences of the iRNA agent are fully
complementary to each other.

In one embodiment, the first target RNA region is encoded by a first gene and the
second target RNA region is encoded by a second gene, and in another embodiment, the first
and second target RNA regions are different regions of an RNA from a single gene. In
another embodiment, the first and second sequences differ by at least 1 and no more than 6
nucleotides.

In certain embodiments, the first and second target RNA regions are on transcripts
encoded by first and second sequence variants, e.g., first and second alleles, of a gene. The
sequence variants can be mutations, or polymorphisms, for example.

In certain embodiments, the first target RNA region includes a nucleotide 
substitution, insertion, or deletion relative to the second target RNA region. 

In other embodiments, the second target RNA region is a mutant or variant of the first
target RNA region.

In certain embodiments, the first and second target RNA regions comprise viral, e.g.,
HCV, or human RNA regions. The first and second target RNA regions can also be on
variant transcripts of an oncogene or include different mutations of a tumor suppressor gene
transcript. In one embodiment, the oncogene or tumor suppressor gene is expressed in the
liver. In addition, the first and second target RNA regions correspond to hot-spots for
genetic variation.

In another aspect, the invention features a mixture of varied iRNA agent molecules,
including one iRNA agent that includes a first sequence and a second sequence sufficiently
complementary to each other to hybridize, and where the first sequence is complementary to
a first target RNA region and the second sequence is complementary to a second target RNA
region. The mixture also includes at least one additional iRNA agent variety that includes a
third sequence and a fourth sequence sufficiently complementary to each other to hybridize,
and where the third sequence is complementary to a third target RNA region and the fourth
sequence is complementary to a fourth target RNA region. In addition, the first or second
sequence is sufficiently complementary to the third or fourth sequence to be capable of
hybridizing to each other. In one embodiment, at least one, two, three or all four of the target
RNA regions are expressed in the liver. Exemplary RNAs are transcribed from the apoB-100
gene, glucose-6-phosphatase gene, beta catenin gene, or an HCV gene.

In certain embodiments, the first and second sequences are on the same or different
RNA strands, and the third and fourth sequences are on the same or different RNA strands.

In one embodiment, the mixture further includes a third iRNA agent that is composed
of the first or second sequence and the third or fourth sequence.

In one embodiment, the first sequence is identical to at least one of the second, third
and fourth sequences, and in another embodiment, the first region differs by at least 1 but no
more than 6 nucleotides from at least one of the second, third and fourth regions.

In certain embodiments, the first target RNA region comprises a nucleotide
substitution, insertion, or deletion relative to the second, third or fourth target RNA region.

The target RNA regions can be variant sequences of a viral or human RNA, and in
certain embodiments, at least two of the target RNA regions can be on variant transcripts of
an oncogene or tumor suppressor gene. In one embodiment, the oncogene or tumor
suppressor gene is expressed in the liver.

In certain embodiments, at least two of the target RNA regions correspond to
hot-spots for genetic variation.

In one embodiment, the iRNA agents of the invention are formulated for
pharmaceutical use. In one aspect, the invention provides a container (e.g., a vial, syringe,
nebulizer, etc.) to hold the iRNA agents described herein.

Another aspect of the invention features a method of making an iRNA agent. The
method includes constructing an iRNA agent that has a first sequence complementary to a
first target RNA region, and a second sequence complementary to a second target RNA
region. The first and second target RNA regions have been identified as being sufficiently
complementary to each other to be capable of hybridizing. In one embodiment, the first and
second target RNA regions are on transcripts expressed in the liver.
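
The identification step described above, finding two target RNA regions that are complementary enough to hybridize, can be sketched as a brute-force window scan. This is only an illustrative sketch: the function names, the fixed window length, the mismatch-count criterion, and the toy sequences are all assumptions, and the specification does not prescribe a particular search procedure or threshold.

```python
# Map each RNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("AUGC", "UACG")


def reverse_complement(rna: str) -> str:
    """Return the sequence (5'->3') that pairs perfectly with `rna`."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def find_complementary_regions(target_a, target_b, length=19, max_mismatches=0):
    """List (i, j, mismatches) where target_a[i:i+length] can pair
    antiparallel with target_b[j:j+length].

    `length` and `max_mismatches` are illustrative parameters, not values
    taken from the specification.
    """
    a, b = target_a.upper(), target_b.upper()
    hits = []
    for i in range(len(a) - length + 1):
        # The window in target_b that would pair perfectly with this window of target_a.
        wanted = reverse_complement(a[i:i + length])
        for j in range(len(b) - length + 1):
            window = b[j:j + length]
            mismatches = sum(x != y for x, y in zip(wanted, window))
            if mismatches <= max_mismatches:
                hits.append((i, j, mismatches))
    return hits


# Toy sequences chosen for the example; real targets would be transcript sequences.
hits = find_complementary_regions("CCCCGAUUAC", "AAGUAAUCAA", length=6)
```

In this toy case the only perfectly complementary pair of 6-nucleotide windows starts at position 4 of the first sequence and position 2 of the second. The scan is quadratic in transcript length, which is acceptable for a sketch but would likely need indexing for full-length transcripts.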

In certain embodiments, the first and second target RNA regions can correspond to
two different regions encoded by one gene, or to regions encoded by two different genes.

Another aspect of the invention features a method of making an iRNA agent
composition. The method includes obtaining or providing information about a region of an
RNA of a target gene (e.g., a viral or human gene, or an oncogene or tumor suppressor, e.g.,
p53), where the region has high variability or mutational frequency (e.g., in humans). In
addition, information about a plurality of RNA targets within the region is obtained or
provided, where each RNA target corresponds to a different variant or mutant of the gene
(e.g., a region including the codon encoding p53 248Q and/or p53 249S). The iRNA agent is
constructed such that a first sequence is complementary to a first of the plurality of variant
RNA targets (e.g., encoding 249Q) and a second sequence is complementary to a second of
the plurality of variant RNA targets (e.g., encoding 249S). The first and second sequences
are sufficiently complementary to hybridize. In certain embodiments, the target gene can be
a viral or human gene expressed in the liver.

In one embodiment, sequence analysis, e.g., to identify common mutants in the target
gene, is used to identify a region of the target gene that has high variability or mutational
frequency. For example, sequence analysis can be used to identify regions of apoB-100 or
beta catenin that have high variability or mutational frequency. In another embodiment, the
region of the target gene having high variability or mutational frequency is identified by
obtaining or providing genotype information about the target gene from a population. In
another embodiment, the genotype information can be from a population suffering from a
liver disorder, such as hepatocellular carcinoma or hepatoblastoma.

Another aspect of the invention features a method of modulating expression, e.g.,
downregulating or silencing, a target gene, by providing an iRNA agent that has a first
sequence and a second sequence sufficiently complementary to each other to hybridize. In
addition, the first sequence is complementary to a first target RNA region and the second
sequence is complementary to a second target RNA region.

In one embodiment, the iRNA agent is administered to a subject, e.g., a human. 

In another embodiment, the first and second sequences are between 15 and 30
nucleotides in length.

In one embodiment, the method of modulating expression of the target gene further
includes providing a second iRNA agent that has a third sequence complementary to a third
target RNA region. The third sequence can be sufficiently complementary to the first or
second sequence to be capable of hybridizing to either the first or second sequence.

Another aspect of the invention features a method of modulating expression, e.g.,
downregulating or silencing, a plurality of target RNAs, each of the plurality of target RNAs
corresponding to a different target gene. The method includes providing an iRNA agent
selected by identifying a first region in a first target RNA of the plurality and a second region
in a second target RNA of the plurality, where the first and second regions are sufficiently
complementary to each other to be capable of hybridizing.

In another aspect of the invention, an iRNA agent molecule includes a first sequence
complementary to a first variant RNA target region and a second sequence complementary to
a second variant RNA target region, and the first and second variant RNA target regions
correspond to first and second variants or mutants of a target gene. In certain embodiments,
the target gene is an apoB-100, beta catenin, or glucose-6-phosphatase gene.

In one embodiment, the target gene is a viral gene (e.g., an HCV gene), tumor
suppressor or oncogene.

In another embodiment, the first and second variant target RNA regions include
allelic variants of the target gene.

In another embodiment, the first and second variant RNA target regions comprise
mutations (e.g., point mutations) or polymorphisms of the target gene.

In one embodiment, the first and second variant RNA target regions correspond to
hot-spots for genetic variation.

Another aspect of the invention features a plurality (e.g., a panel or bank) of iRNA
agents. Each of the iRNA agents of the plurality includes a first sequence complementary to
a first variant target RNA region and a second sequence complementary to a second variant
target RNA region, where the first and second variant target RNA regions correspond to first
and second variants of a target gene. In certain embodiments, the variants are allelic variants
of the target gene.

Another aspect of the invention provides a method of identifying an iRNA agent for
treating a subject. The method includes providing or obtaining information, e.g., a genotype,
about a target gene, providing or obtaining information about a plurality (e.g., panel or bank)
of iRNA agents, comparing the information about the target gene to information about the
plurality of iRNA agents, and selecting one or more of the plurality of iRNA agents for
treating the subject. Each of the plurality of iRNA agents includes a first sequence
complementary to a first variant target RNA region and a second sequence complementary to
a second variant target RNA region, and the first and second variant target RNA regions
correspond to first and second variants of the target gene. The target gene can be an
endogenous gene of the subject or a viral gene. The information about the plurality of iRNA
agents can be the sequence of the first or second sequence of one or more of the plurality.

In certain embodiments, at least one of the selected iRNA agents includes a sequence
capable of hybridizing to an RNA region corresponding to the target gene, and at least one of
the selected iRNA agents comprises a sequence capable of hybridizing to an RNA region
corresponding to a variant or mutant of the target gene.
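
The compare-and-select step of this method can be pictured as matching a subject's genotype against a panel record for each agent. The sketch below is one hypothetical way to do that: the `PanelAgent` record, the `select_agents` function, the ranking by number of covered variants, and the agent names are all assumptions introduced for illustration. The variant labels 248Q and 249S echo the p53 example given earlier in this summary, but the panel itself is invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PanelAgent:
    """One entry in a hypothetical panel: an agent and the two variants it targets."""
    name: str
    first_variant: str   # variant matched by the agent's first sequence
    second_variant: str  # variant matched by the agent's second sequence


def select_agents(genotype, panel):
    """Return panel agents that target at least one variant in `genotype`.

    Agents covering more of the observed variants come first; ties keep
    panel order. This ranking rule is an assumption of the sketch.
    """
    observed = set(genotype)
    scored = []
    for agent in panel:
        covered = len({agent.first_variant, agent.second_variant} & observed)
        if covered:
            scored.append((covered, agent))
    scored.sort(key=lambda item: item[0], reverse=True)  # stable sort
    return [agent for _, agent in scored]


# Invented panel for illustration only.
panel = [
    PanelAgent("agent-1", "wild-type", "248Q"),
    PanelAgent("agent-2", "248Q", "249S"),
    PanelAgent("agent-3", "wild-type", "variant-X"),
]
selected = select_agents({"248Q", "249S"}, panel)
selected_names = [agent.name for agent in selected]
```

Here `selected_names` is `["agent-2", "agent-1"]`: the agent targeting both observed variants ranks first, and the agent targeting neither is left out.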

In one aspect, the invention relates to compositions and methods for silencing genes
expressed in the liver, e.g., to treat disorders of or related to the liver. An iRNA agent
composition of the invention can be one which has been modified to alter distribution in
favor of the liver.

In another aspect, the invention relates to iRNA agents that can target more than one
RNA region, and methods of using and making the iRNA agents. In one embodiment, the
RNA is from a gene that is active in the liver, e.g., apoB-100, glucose-6-phosphatase,
beta-catenin, or Hepatitis C virus (HCV).

In another aspect, an iRNA agent includes a first and second sequence that are
sufficiently complementary to each other to hybridize. The first sequence can be
complementary to a first target RNA region and the second sequence can be complementary
to a second target RNA region. For example, the first sequence can be complementary to a
first target apoB-100 RNA region and the second sequence can be complementary to a
second target apoB-100 RNA region.

In one embodiment, the first target RNA region is encoded by a first gene, e.g., a
gene expressed in the liver, and the second target RNA region is encoded by a second gene,
e.g., a second gene expressed in the liver. In another embodiment, the first and second target
RNA regions are different regions of an RNA from a single gene, e.g., a single gene that is at
least expressed in the liver. In another embodiment, the first and second sequences differ by
at least one and no more than six nucleotides.

In another embodiment, sequence analysis, e.g., to identify common mutants in the
target gene, is used to identify a region of the target gene that has high variability or
mutational frequency. For example, sequence analysis can be used to identify regions of
apoB-100 or beta catenin that have high variability or mutational frequency. In another
embodiment, the region of the target gene having high variability or mutational frequency is
identified by obtaining or providing genotype information about the target gene from a
population. In particular, the genotype information can be from a population suffering from
a liver disorder, such as hepatocellular carcinoma or hepatoblastoma.

In another aspect, the invention features a method for reducing apoB-100 levels in a
subject, e.g., a mammal, such as a human. The method includes administering to a subject an
iRNA agent which targets apoB-100. The iRNA agent can be one described here, and can be
a dsRNA that has a sequence that is substantially identical to a sequence of the apoB-100
gene. The iRNA can be less than 30 nucleotides in length, e.g., 21-23 nucleotides.
Preferably, the iRNA is 21 nucleotides in length. In one embodiment, the iRNA is 21
nucleotides in length, and the duplex region of the iRNA is 19 nucleotides. In another
embodiment, the iRNA is greater than 30 nucleotides in length.

In a preferred embodiment, the subject is treated with an iRNA agent which targets
one of the sequences listed in Tables 5 and 6. In a preferred embodiment it targets both
sequences of a palindromic pair provided in Tables 5 and 6. The most preferred targets are
listed in descending order of preferability; in other words, the more preferred targets are
listed earlier in Tables 5 and 6.

In a preferred embodiment the iRNA agent will include regions, or strands, which are
complementary to a pair in Tables 5 and 6. In a preferred embodiment the iRNA agent will
include regions complementary to the palindromic pairs of Tables 5 and 6 as a duplex region.

In a preferred embodiment the duplex region of the iRNA agent will target a sequence
listed in Tables 5 and 6 but will not be perfectly complementary with the target sequence,
e.g., it will not be complementary at at least 1 base pair. Preferably it will have no more than
1, 2, 3, 4, or 5 bases, in total, or per strand, which do not hybridize with the target sequence.

In a preferred embodiment the iRNA agent includes overhangs, e.g., 3' or 5'
overhangs, preferably one or more 3' overhangs. Overhangs are discussed in detail
elsewhere herein but are preferably about 2 nucleotides in length. The overhangs can be
complementary to the gene sequences being targeted or can be other sequence. TT is a
preferred overhang sequence. The first and second iRNA agent sequences can also be joined,
e.g., by additional bases to form a hairpin, or by other non-base linkers.
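
To make the preferred geometry concrete (21-nucleotide strands, a 19-nucleotide duplex region, and 2-nucleotide 3' TT overhangs), the following Python sketch assembles the two strands from a 19-nucleotide target. It is a minimal sketch under stated assumptions: the function `design_duplex` and the example target are hypothetical, the target is not drawn from Tables 5 and 6, and no chemical modifications or deliberate mismatches are modeled.

```python
# Map each RNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("AUGC", "UACG")


def reverse_complement(rna: str) -> str:
    """Return the sequence (5'->3') that pairs perfectly with `rna`."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def design_duplex(target_19mer: str, overhang: str = "TT"):
    """Build (sense, antisense) strands, each written 5'->3'.

    The first 19 nucleotides of each strand form the duplex region and the
    overhang is appended at each 3' end. `TT` denotes two deoxythymidines,
    which is why it is not converted to `UU`.
    """
    target = target_19mer.upper().replace("T", "U")  # accept DNA-style input
    if len(target) != 19 or set(target) - set("AUGC"):
        raise ValueError("this sketch expects exactly 19 A/U/G/C nucleotides")
    sense = target + overhang
    antisense = reverse_complement(target) + overhang
    return sense, antisense


# Arbitrary 19-nucleotide example, not a sequence from the specification.
example_target = "GCAUGGCUAGCUAAGCUGA"
sense, antisense = design_duplex(example_target)
```

Each returned strand is 21 nucleotides long. Joining the two sequences through a loop of additional bases, as in the hairpin variant mentioned above, would be a small extension of the same idea.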

The iRNA agent that targets apoB-100 can be administered in an amount sufficient to
reduce expression of apoB-100 mRNA. In one embodiment, the iRNA agent is administered
in an amount sufficient to reduce expression of apoB-100 protein (e.g., by at least 2%, 4%,
6%, 10%, 15%, 20%). Preferably, the iRNA agent does not reduce expression of apoB-48
mRNA or protein. This can be effected, e.g., by selection of an iRNA agent which
specifically targets the nucleotides subject to RNA editing in the apoB-100 transcript.

The iRNA agent that targets apoB-100 can be administered to a subject, wherein the
subject is suffering from a disorder characterized by elevated or otherwise unwanted
expression of apoB-100, elevated or otherwise unwanted levels of cholesterol, and/or
dysregulation of lipid metabolism. The iRNA agent can be administered to an individual at
risk for the disorder to delay onset of the disorder or a symptom of the disorder. These
disorders include HDL/LDL cholesterol imbalance; dyslipidemias, e.g., familial combined
hyperlipidemia (FCHL), acquired hyperlipidemia; hypercholesterolemia; statin-resistant
hypercholesterolemia; coronary artery disease (CAD); coronary heart disease (CHD); and
atherosclerosis. In one embodiment, the iRNA that targets apoB-100 is administered to a
subject suffering from statin-resistant hypercholesterolemia.

The apoB-100 iRNA agent can be administered in an amount sufficient to reduce
levels of serum LDL-C and/or HDL-C and/or total cholesterol in a subject. For example, the
iRNA is administered in an amount sufficient to decrease total cholesterol by at least 0.5%,
1%, 2.5%, 5%, 10% in the subject. In one embodiment, the iRNA agent is administered in
an amount sufficient to reduce the risk of myocardial infarction in the subject.

In a preferred embodiment the iRNA agent is administered repeatedly.
Administration of an iRNA agent can be carried out over a range of time periods. It can be
administered daily, once every few days, weekly, or monthly. The timing of administration
can vary from patient to patient, depending on such factors as the severity of a patient's
symptoms. For example, an effective dose of an iRNA agent can be administered to a patient
once a month for an indefinite period of time, or until the patient no longer requires therapy.
In addition, sustained release compositions containing an iRNA agent can be used to
maintain a relatively constant dosage in the patient's blood.

In one embodiment, the iRNA agent can be targeted to the liver, and apoB expression
levels are decreased in the liver following administration of the apoB iRNA agent. For
example, the iRNA agent can be complexed with a moiety that targets the liver, e.g., an
antibody or ligand that binds a receptor on the liver.

The iRNA agent, particularly an iRNA agent that targets apoB, beta-catenin or
glucose-6-phosphatase RNA, can be targeted to the liver, for example by associating, e.g.,
conjugating the iRNA agent to a lipophilic moiety, e.g., a lipid, cholesterol, oleyl, retinyl, or
cholesteryl residue (see Table 1). Other lipophilic moieties that can be associated, e.g.,
conjugated with the iRNA agent include cholic acid, adamantane acetic acid, 1-pyrene
butyric acid, dihydrotestosterone, 1,3-Bis-O(hexadecyl)glycerol, geranyloxyhexyl group,
hexadecylglycerol, borneol, menthol, 1,3-propanediol, heptadecyl group, palmitic acid,
myristic acid, O3-(oleoyl)lithocholic acid, O3-(oleoyl)cholenic acid, dimethoxytrityl, or
phenoxazine. In one embodiment, the iRNA agent can be targeted to the liver by associating,
e.g., conjugating, the iRNA agent to a low-density lipoprotein (LDL), e.g., a lactosylated
LDL. In another embodiment, the iRNA agent can be targeted to the liver by associating,
e.g., conjugating, the iRNA agent to a polymeric carrier complex with sugar residues.

In another embodiment, the iRNA agent can be targeted to the liver by associating,
e.g., conjugating, the iRNA agent to a liposome complexed with sugar residues. A targeting
agent that incorporates a sugar, e.g., galactose and/or analogues thereof, is particularly useful.
These agents target, in particular, the parenchymal cells of the liver (see Table 1). In a
preferred embodiment, the targeting moiety includes more than one galactose moiety,
preferably two or three. Preferably, the targeting moiety includes 3 galactose moieties, e.g.,
spaced about 15 angstroms from each other. The targeting moiety can be lactose. A lactose
is a glucose coupled to a galactose. Preferably, the targeting moiety includes three lactoses.
The targeting moiety can also be N-Acetyl-Galactosamine or N-Ac-Glucosamine. A mannose
or mannose-6-phosphate targeting moiety can be used for macrophage targeting.

The targeting agent can be linked directly, e.g., covalently or non-covalently, to the
iRNA agent, or to another delivery or formulation modality, e.g., a liposome. E.g., the iRNA
agents with or without a targeting moiety can be incorporated into a delivery modality, e.g., a
liposome, with or without a targeting moiety.

It is particularly preferred to use an iRNA conjugated to a lipophilic molecule
wherein the iRNA agent is an apoB, beta-catenin or glucose-6-phosphatase iRNA
targeting agent.

In one embodiment, the iRNA agent has been modified, or is associated with a
delivery agent, e.g., a delivery agent described herein, e.g., a liposome, which has been
modified to alter distribution in favor of the liver. In one embodiment, the modification
mediates association with a serum albumin (SA), e.g., a human serum albumin (HSA), or a
fragment thereof.

The iRNA agent, particularly an iRNA agent that targets apoB, beta-catenin or
glucose-6-phosphatase RNA, can be targeted to the liver, for example by associating, e.g.,
conjugating the iRNA agent to an SA molecule, e.g., an HSA molecule, or a fragment
thereof. In one embodiment, the iRNA agent or composition thereof has an affinity for an
SA, e.g., HSA, which is sufficiently high such that its levels in the liver are at least 10, 20,
30, 50, or 100% greater in the presence of SA, e.g., HSA, or is such that addition of
exogenous SA will increase delivery to the liver. These criteria can be measured, e.g., by
testing distribution in a mouse in the presence or absence of exogenous mouse or human SA.

The SA, e.g., HSA, targeting agent can be linked directly, e.g., covalently or
non-covalently, to the iRNA agent, or to another delivery or formulation modality, e.g., a
liposome. E.g., the iRNA agents with or without a targeting moiety can be incorporated into
a delivery modality, e.g., a liposome, with or without a targeting moiety.

It is particularly preferred to use an iRNA conjugated to an SA, e.g., an HSA,
molecule wherein the iRNA agent is an apoB, beta-catenin or glucose-6-phosphatase iRNA
targeting agent.

In another aspect, the invention features a method for reducing
glucose-6-phosphatase levels in a subject, e.g., a mammal, such as a human. The method includes
administering to a subject an iRNA agent which targets glucose-6-phosphatase. The iRNA
agent can be a dsRNA that has a sequence that is substantially identical to a sequence of the
glucose-6-phosphatase gene.

In a preferred embodiment, the subject is treated with an iRNA agent which targets
one of the sequences listed in Table 7. In a preferred embodiment it targets both sequences
of a palindromic pair provided in Table 7. The most preferred targets are listed in
descending order of preferability; in other words, the more preferred targets are listed earlier
in Table 7.

In a preferred embodiment the iRNA agent will include regions, or strands, which are
complementary to a pair in Table 7. In a preferred embodiment the iRNA agent will include
regions complementary to the palindromic pairs of Table 7 as a duplex region.

In a preferred embodiment the duplex region of the iRNA agent will target a sequence
listed in Table 7 but will not be perfectly complementary with the target sequence, e.g., it
will not be complementary at at least 1 base pair. Preferably it will have no more than 1, 2,
3, 4, or 5 bases, in total, or per strand, which do not hybridize with the target sequence.

In a preferred embodiment the iRNA agent includes overhangs, e.g., 3' or 5' 
overhangs, preferably one or more 3' overhangs. Overhangs are discussed in detail 
elsewhere herein but are preferably about 2 nucleotides in length. The overhangs can be 
complementary to the gene sequences being targeted or can be other sequence. TT is a
preferred overhang sequence. The first and second iRNA agent sequences can also be joined, 
e.g., by additional bases to form a hairpin, or by other non-base linkers. 

Table 7 refers to sequences from human glucose-6-phosphatase. Table 8 refers to 
sequences from rat glucose-6-phosphatase. The sequences from Table 8 can be used, e.g., in
experiments with rats or cultured rat cells. 

In a preferred embodiment the iRNA agent can have any architecture, e.g., an architecture
described herein. E.g., it can be incorporated into an iRNA agent having an overhang
structure, overall length, hairpin vs. two-strand structure, as described herein. In addition,
monomers other than naturally occurring ribonucleotides can be used in the selected iRNA
agent.

The iRNA that targets glucose-6-phosphatase can be administered in an amount
sufficient to reduce expression of glucose-6-phosphatase mRNA.

The iRNA that targets glucose-6-phosphatase can be administered to a subject to 
inhibit hepatic glucose production, for the treatment of glucose-metabolism-related disorders, 
such as diabetes, e.g., type-2-diabetes mellitus. The iRNA agent can be administered to an 
individual at risk for the disorder to delay onset of the disorder or a symptom of the disorder. 

In other embodiments, iRNA agents having sequence similarity to the following
genes can also be used to inhibit hepatic glucose production. These other genes include
forkhead homologue in rhabdomyosarcoma (FKHR); glucagon; glucagon receptor;
glycogen phosphorylase; PPAR-Gamma Coactivator (PGC-1); Fructose-1,6-bisphosphatase;
glucose-6-phosphate locator; glucokinase inhibitory regulatory protein; and
phosphoenolpyruvate carboxykinase (PEPCK).

In one embodiment, the iRNA agent can be targeted to the liver, and RNA expression
levels of the targeted genes are decreased in the liver following administration of the iRNA
agent.

The iRNA agent can be one described herein, and can be a dsRNA that has a
sequence that is substantially identical to a sequence of a target gene. The iRNA can be less
than 30 nucleotides in length, e.g., 21-23 nucleotides. Preferably, the iRNA is 21 nucleotides
in length. In one embodiment, the iRNA is 21 nucleotides in length, and the duplex region of
the iRNA is 19 nucleotides. In another embodiment, the iRNA is greater than 30 nucleotides
in length.

In another aspect, the invention features a method for reducing beta-catenin levels in
a subject, e.g., a mammal, such as a human. The method includes administering to a subject
an iRNA agent that targets beta-catenin. The iRNA agent can be one described herein, and
can be a dsRNA that has a sequence that is substantially identical to a sequence of the
beta-catenin gene. The iRNA can be less than 30 nucleotides in length, e.g., 21-23 nucleotides.
Preferably, the iRNA is 21 nucleotides in length. In one embodiment, the iRNA is 21
nucleotides in length, and the duplex region of the iRNA is 19 nucleotides. In another
embodiment, the iRNA is greater than 30 nucleotides in length.

In a preferred embodiment, the subject is treated with an iRNA agent which targets
one of the sequences listed in Table 9. In a preferred embodiment it targets both sequences
of a palindromic pair provided in Table 9. The most preferred targets are listed in
descending order of preferability; in other words, the more preferred targets are listed earlier
in Table 9.

In a preferred embodiment the iRNA agent will include regions, or strands, which are
complementary to a pair in Table 9. In a preferred embodiment the iRNA agent will include
regions complementary to the palindromic pairs of Table 9 as a duplex region.

In a preferred embodiment the duplex region of the iRNA agent will target a sequence
listed in Table 9 but will not be perfectly complementary with the target sequence, e.g., it
will not be complementary at at least 1 base pair. Preferably it will have no more than 1, 2,
3, 4, or 5 bases, in total, or per strand, which do not hybridize with the target sequence.
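
By way of non-limiting illustration, the condition that a strand be imperfectly complementary to its target, yet have no more than a chosen number of non-hybridizing bases, can be checked by a short program. The following Python sketch assumes an RNA alphabet (A, U, G, C), equal-length sequences written 5' to 3', and canonical Watson-Crick pairing only; the function names and the default limit of 5 are hypothetical choices made for this sketch.

```python
# Illustrative sketch only; names and the default limit are assumptions.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def count_mismatches(strand: str, target: str) -> int:
    """Count positions at which `strand` fails to pair with `target`.

    Both sequences are written 5'->3' and must have equal length; the
    strand is aligned anti-parallel to the target.
    """
    if len(strand) != len(target):
        raise ValueError("strand and target must have the same length")
    return sum(
        1 for s, t in zip(strand, reversed(target)) if COMPLEMENT.get(s) != t
    )


def within_mismatch_limit(strand: str, target: str, max_mismatches: int = 5) -> bool:
    """True if the strand is imperfectly complementary but within the limit."""
    return 1 <= count_mismatches(strand, target) <= max_mismatches
```

A perfectly complementary strand yields a count of zero and is therefore rejected by `within_mismatch_limit`, which mirrors the requirement of at least one non-complementary position.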

In a preferred embodiment the iRNA agent includes overhangs, e.g., 3' or 5'
overhangs, preferably one or more 3' overhangs. Overhangs are discussed in detail
elsewhere herein but are preferably about 2 nucleotides in length. The overhangs can be
complementary to the gene sequences being targeted or can be other sequence. TT is a
preferred overhang sequence. The first and second iRNA agent sequences can also be joined,
e.g., by additional bases to form a hairpin, or by other non-base linkers.

The iRNA agent that targets beta-catenin can be administered in an amount sufficient
to reduce expression of beta-catenin mRNA. In one embodiment, the iRNA agent is
administered in an amount sufficient to reduce expression of beta-catenin protein (e.g., by at
least 2%, 4%, 6%, 10%, 15%, 20%).

The iRNA agent that targets beta-catenin can be administered to a subject, wherein
the subject is suffering from a disorder characterized by unwanted cellular proliferation in the
liver or of liver tissue, e.g., metastatic tissue originating from the liver. Examples include a
benign or malignant disorder, e.g., a cancer, e.g., a hepatocellular carcinoma (HCC), hepatic
metastasis, or hepatoblastoma.

The iRNA agent can be administered to an individual at risk for the disorder to delay
onset of the disorder or a symptom of the disorder.

In a preferred embodiment the iRNA agent is administered repeatedly. 

Administration of an iRNA agent can be carried out over a range of time periods. It can be 
administered daily, once every few days, weekly, or monthly. The timing of administration 
can vary from patient to patient, depending on such factors as the severity of a patient's 
symptoms. For example, an effective dose of an iRNA agent can be administered to a patient 
once a month for an indefinite period of time, or until the patient no longer requires therapy.
In addition, sustained release compositions containing an iRNA agent can be used to 
maintain a relatively constant dosage in the patient's blood. 

In one embodiment, the iRNA agent can be targeted to the liver, and beta-catenin
expression levels are decreased in the liver following administration of the beta-catenin iRNA
agent. For example, the iRNA agent can be complexed with a moiety that targets the liver,
e.g., an antibody or ligand that binds a receptor on the liver.

In another aspect, the invention provides methods to treat liver disorders, e.g.,
disorders characterized by unwanted cell proliferation, hematological disorders, disorders
characterized by inflammation, and metabolic or viral diseases or disorders of the
liver. A proliferation disorder of the liver can be, for example, a benign or malignant
disorder, e.g., a cancer, e.g., a hepatocellular carcinoma (HCC), hepatic metastasis, or
hepatoblastoma. A hepatic hematology or inflammation disorder can be a disorder involving
clotting factors, a complement-mediated inflammation or a fibrosis, for example. Metabolic
diseases of the liver can include dyslipidemias and irregularities in glucose regulation. Viral
diseases of the liver can include hepatitis C or hepatitis B. In one embodiment, a liver
disorder is treated by administering one or more iRNA agents that have a sequence that is
substantially identical to a sequence in a gene involved in the liver disorder.

In one embodiment an iRNA agent to treat a liver disorder has a sequence which is
substantially identical to a sequence of the beta-catenin or c-jun gene. In another
embodiment, such as for the treatment of hepatitis C or hepatitis B, the iRNA agent can have
a sequence that is substantially identical to a sequence of a gene of the hepatitis C virus or the
hepatitis B virus, respectively. For example, the iRNA agent can target the 5' core region of
HCV. This region lies just downstream of the ribosomal toe-print straddling the initiator
methionine. Alternatively, an iRNA agent of the invention can target any one of the
nonstructural proteins of HCV: NS3, 4A, 4B, 5A, or 5B. For the treatment of hepatitis B, an
iRNA agent can target the protein X (HBx) gene, for example.

In a preferred embodiment, the subject is treated with an iRNA agent which targets
one of the sequences listed in Table 10. In a preferred embodiment it targets both sequences
of a palindromic pair provided in Table 10. The most preferred targets are listed in
descending order of preferability; in other words, the more preferred targets are listed earlier
in Table 10.

In a preferred embodiment the iRNA agent will include regions, or strands, which are
complementary to a pair in Table 10. In a preferred embodiment the iRNA agent will
include regions complementary to the palindromic pairs of Table 10 as a duplex region.

In a preferred embodiment the duplex region of the iRNA agent will target a sequence
listed in Table 10, but will not be perfectly complementary with the target sequence, e.g., it
will not be complementary at at least 1 base pair. Preferably it will have no more than 1, 2,
3, 4, or 5 bases, in total, or per strand, which do not hybridize with the target sequence.

In a preferred embodiment the iRNA agent includes overhangs, e.g., 3' or 5'
overhangs, preferably one or more 3' overhangs. Overhangs are discussed in detail
elsewhere herein but are preferably about 2 nucleotides in length. The overhangs can be
complementary to the gene sequences being targeted or can be other sequence. TT is a
preferred overhang sequence. The first and second iRNA agent sequences can also be joined,
e.g., by additional bases to form a hairpin, or by other non-base linkers.

In another aspect, an iRNA agent can be administered to modulate blood clotting,
e.g., to reduce the tendency to form a blood clot. In a preferred embodiment the iRNA agent
targets Factor V expression, preferably in the liver. One or more iRNA agents can be used to
target a wild type allele, a mutant allele, e.g., the Leiden Factor V allele, or both. Such
administration can be used to treat or prevent venous thrombosis, e.g., deep vein thrombosis
or pulmonary embolism, or another disorder caused by elevated or otherwise unwanted
expression of Factor V, in, e.g., the liver. In one embodiment the iRNA agent can treat a
subject, e.g., a human who has Factor V Leiden or other genetic trait associated with an
unwanted tendency to form blood clots.

In a preferred embodiment administration of an iRNA agent which targets Factor V is
combined with the administration of a second treatment, e.g., a treatment which reduces the
tendency of the blood to clot, e.g., the administration of heparin or of a low molecular weight
heparin.

In one embodiment, the iRNA agent that targets Factor V can be used as a
prophylaxis in patients, e.g., patients with Factor V Leiden, who are placed at risk for a
thrombosis, e.g., those about to undergo surgery, in particular those about to undergo
high-risk surgical procedures known to be associated with formation of venous thrombosis, those
about to undergo a prolonged period of relative inactivity, e.g., on a motor vehicle, train or
airplane flight, e.g., a flight or other trip lasting more than three or five hours. Such a
treatment can be an adjunct to the therapeutic use of low molecular weight (LMW) heparin
prophylaxis.

In another embodiment, the iRNA agent that targets Factor V can be administered to
patients with Factor V Leiden to treat deep vein thrombosis (DVT) or pulmonary embolism
(PE). Such a treatment can be an adjunct to (or can replace) therapeutic uses of heparin or
coumadin. The treatment can be administered by inhalation or generally by pulmonary
routes.

In a preferred embodiment, an iRNA agent administered to treat a liver disorder is
targeted to the liver. For example, the iRNA agent can be complexed with a targeting 
moiety, e.g., an antibody or ligand that recognizes a liver-specific receptor. 

The invention also includes preparations, including substantially pure or
pharmaceutically acceptable preparations, of iRNA agents which silence any of the genes
discussed herein and in particular any of apoB-100, glucose-6-phosphatase, beta-catenin,
Factor V, or any of the HCV genes discussed herein.

The methods and compositions of the invention, e.g., the methods and compositions
to treat diseases and disorders of the liver described herein, can be used with any of the iRNA
agents described. In addition, the methods and compositions of the invention can be used for
the treatment of any disease or disorder described herein, and for the treatment of any
subject, e.g., any animal, any mammal, such as any human.

In another aspect, the invention features a method of selecting two sequences or
strands for use in an iRNA agent. The method includes:

providing a first candidate sequence and a second candidate sequence;

determining the value of a parameter which is a function of the number of
palindromic pairs between the first and second sequence, wherein a palindromic pair is a
nucleotide on said first sequence which, when the sequences are aligned in anti-parallel
orientation, will hybridize with a nucleotide on said second sequence;

comparing the number with a predetermined reference value, and if the number has
a predetermined relationship with the reference, e.g., if it is the same or greater, selecting the
sequences for use in an iRNA agent. In most cases each of the two sequences will be
completely complementary with a target sequence (though, as described elsewhere herein, that
may not always be the case; there may not be perfect complementarity with one or both of
the target sequences) and will have sufficient complementarity with each other to form a
duplex. The parameter can be derived, e.g., by directly determining the number of
palindromic pairs, e.g., by inspection or by the use of a computer program which compares
or analyzes sequence. The parameter can also be determined less directly, and include, e.g.,
calculation of or measurement of the Tm or other value related to the free energy of
association or dissociation of a duplex.
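
By way of non-limiting illustration, the direct determination of the number of palindromic pairs by a computer program can be sketched as follows. This Python sketch assumes equal-length RNA sequences written 5' to 3', treats "hybridize" as canonical Watson-Crick pairing, and takes "the same or greater" as the predetermined relationship; the function names and the reference value supplied by the caller are hypothetical.

```python
# Illustrative sketch only; names are assumptions made for this example.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def count_palindromic_pairs(first: str, second: str) -> int:
    """Count nucleotides of `first` that pair with `second` when the two
    sequences (both written 5'->3') are aligned in anti-parallel orientation."""
    if len(first) != len(second):
        raise ValueError("sequences must have the same length")
    return sum((a, b) in WATSON_CRICK for a, b in zip(first, reversed(second)))


def select_pair(first: str, second: str, reference: int) -> bool:
    """Select the pair if the count is the same as or greater than the reference."""
    return count_palindromic_pairs(first, second) >= reference
```

For example, `count_palindromic_pairs("GGAUCC", "GGAUCC")` counts six pairs because that sequence is its own reverse complement, whereas two all-A sequences give zero.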

In a preferred embodiment the determination can be performed on a target sequence,
e.g., a genomic sequence. In such embodiments the selected sequence is converted to its
complement in the iRNA agent.

In a preferred embodiment the first and second sequences are selected from the
sequence of a single target gene. In other embodiments the first sequence is selected from
the sequence of a first target gene and the second sequence is selected from the sequence of a
second target gene.

In a preferred embodiment the method includes comparing blocks of sequence, e.g.,
blocks which are between 15 and 25 nucleotides in length, and preferably 19, 20, or 21, and
most preferably 19 nucleotides in length, to determine if they are suitable for use, e.g., if they
possess sufficient palindromic pairs.
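
By way of non-limiting illustration, the block-wise comparison can be sketched as an exhaustive scan over all pairs of equal-length blocks taken from two sequences. The Python sketch below assumes RNA sequences written 5' to 3' and canonical Watson-Crick pairing; the function names and the default threshold `min_pairs=15` are hypothetical and are not values taken from this disclosure.

```python
# Illustrative sketch only; the threshold default is an assumption.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def palindromic_pairs(first: str, second: str) -> int:
    """Pairs formed when two equal-length blocks are aligned anti-parallel."""
    return sum((a, b) in WATSON_CRICK for a, b in zip(first, reversed(second)))


def scan_blocks(seq1: str, seq2: str, block_len: int = 19, min_pairs: int = 15):
    """Return (i, j, count) for every block pair meeting the threshold.

    i and j are 0-based start offsets of the blocks in seq1 and seq2.
    """
    hits = []
    for i in range(len(seq1) - block_len + 1):
        block1 = seq1[i:i + block_len]
        for j in range(len(seq2) - block_len + 1):
            block2 = seq2[j:j + block_len]
            count = palindromic_pairs(block1, block2)
            if count >= min_pairs:
                hits.append((i, j, count))
    return hits
```

The scan is quadratic in sequence length, which is adequate for a sketch; passing the same sequence as both arguments compares blocks drawn from a single target gene.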

In a preferred embodiment the first and second sequences are divided into a plurality
of regions, e.g., terminal regions and a middle region disposed between the terminal regions,
and wherein the reference value, or the predetermined relationship to the reference value, is
different for at least two regions. E.g., the first and second sequences, when aligned in
anti-parallel orientation, are divided into terminal regions each of a selected number of base pairs,
e.g., 2, 3, 4, 5, or 6, and a middle region, and the reference value for the terminal regions is
higher than for the middle region. In other words, a higher number or proportion of
palindromic pairs is required in the terminal regions.
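
By way of non-limiting illustration, region-specific reference values can be applied by computing the proportion of palindromic pairs separately for each terminal region and for the middle region. The Python sketch below assumes equal-length RNA sequences written 5' to 3' and canonical Watson-Crick pairing; the function names and the default thresholds (all pairs in each terminal region, 80% in the middle) are hypothetical and chosen only so that the terminal requirement is the stricter one.

```python
# Illustrative sketch only; threshold defaults are assumptions.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def regional_pair_fractions(first: str, second: str, terminal_len: int = 3):
    """Return the fraction of paired positions in the (left terminal,
    middle, right terminal) regions of the anti-parallel alignment."""
    if len(first) != len(second):
        raise ValueError("sequences must have the same length")
    if terminal_len < 1 or len(first) <= 2 * terminal_len:
        raise ValueError("terminal_len must leave a non-empty middle region")
    flags = [(a, b) in WATSON_CRICK for a, b in zip(first, reversed(second))]
    regions = (
        flags[:terminal_len],
        flags[terminal_len:-terminal_len],
        flags[-terminal_len:],
    )
    return tuple(sum(region) / len(region) for region in regions)


def passes_regional_thresholds(
    first: str,
    second: str,
    terminal_len: int = 3,
    terminal_min: float = 1.0,
    middle_min: float = 0.8,
) -> bool:
    """Apply a stricter reference value to the terminal regions."""
    left, middle, right = regional_pair_fractions(first, second, terminal_len)
    return left >= terminal_min and right >= terminal_min and middle >= middle_min
```

With these defaults a single unpaired position in the middle of a 12-nucleotide alignment is tolerated, while a single unpaired position in either terminal region is not.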

In a preferred embodiment the first and second sequences are gene sequences; thus the
complements of the sequences will be used in an iRNA agent.

In a preferred embodiment hybridize means a classical Watson-Crick pairing. In other
embodiments hybridize can include non-Watson-Crick pairing, e.g., pairings seen in micro
RNA precursors.

In a preferred embodiment the method includes the addition of nucleotides to form
overhangs, e.g., 3' or 5' overhangs, preferably one or more 3' overhangs. Overhangs are
discussed in detail elsewhere herein but are preferably about 2 nucleotides in length. The
overhangs can be complementary to the gene sequences being targeted or can be other
sequence. TT is a preferred overhang sequence. The first and second iRNA agent sequences
can also be joined, e.g., by additional bases to form a hairpin, or by other non-base linkers.
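
By way of non-limiting illustration, the addition of overhangs and the joining of two strands can be sketched as simple string operations on sequences written 5' to 3'. In the Python sketch below the function names are hypothetical, the default overhang is the TT sequence named above, and the loop sequence for the hairpin must be supplied by the caller because no particular loop is specified here.

```python
# Illustrative sketch only; function names are assumptions.
def add_3prime_overhangs(sense_core: str, antisense_core: str, overhang: str = "TT"):
    """Append the overhang to the 3' end of each strand (sequences are 5'->3')."""
    return sense_core + overhang, antisense_core + overhang


def join_as_hairpin(sense: str, antisense: str, loop: str) -> str:
    """Join the two strands through additional bases so that a single
    molecule can fold back on itself to form the duplex region."""
    return sense + loop + antisense
```

Applied to two 19-nucleotide cores, `add_3prime_overhangs` gives two 21-nucleotide strands with a 19-pair duplex region, consistent with the preferred lengths described herein.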

In a preferred embodiment the method is used to select all or part of an iRNA agent.
The selected sequences can be incorporated into an iRNA agent having any architecture, e.g.,
an architecture described herein. E.g., it can be incorporated into an iRNA agent having an
overhang structure, overall length, hairpin vs. two-strand structure, as described herein. In
addition, monomers other than naturally occurring ribonucleotides can be used in the selected
iRNA agent.

Preferred iRNA agents of this method will target genes expressed in the liver, e.g.,
one of the genes disclosed herein, e.g., apoB, beta-catenin, an HCV gene, or
glucose-6-phosphatase.

In another aspect, the invention features an iRNA agent determined, made, or
selected by a method described herein.

The methods and compositions of the invention, e.g., the methods and iRNA
compositions to treat liver-based diseases described herein, can be used with any dosage
and/or formulation described herein, as well as with any route of administration described
herein.

The invention also provides for the use of an iRNA agent which includes monomers
which can form other than a canonical Watson-Crick pairing with another monomer, e.g., a
monomer on another strand.

The use of "other than canonical Watson-Crick pairing" between monomers of a
duplex can be used to control, often to promote, melting of all or part of a duplex. The iRNA
agent can include a monomer at a selected or constrained position that results in a first level
of stability in the iRNA agent duplex (e.g., between the two separate molecules of a double
stranded iRNA agent) and a second level of stability in a duplex between a sequence of an
iRNA agent and another sequence molecule, e.g., a target or off-target sequence in a subject.
In some cases the second duplex has a relatively greater level of stability, e.g., in a duplex
between an anti-sense sequence of an iRNA agent and a target mRNA. In this case one or
more of the monomers, the position of the monomers in the iRNA agent, and the target
sequence (sometimes referred to herein as the selection or constraint parameters), are
selected such that the iRNA agent duplex has a comparatively lower free energy of
association (which, while not wishing to be bound by mechanism or theory, is believed to
contribute to efficacy by promoting disassociation of the duplex iRNA agent in the context of
the RISC) while the duplex formed between an anti-sense targeting sequence and its target
sequence has a relatively higher free energy of association (which, while not wishing to be
bound by mechanism or theory, is believed to contribute to efficacy by promoting association
of the anti-sense sequence and the target RNA).

In other cases the second duplex has a relatively lower level of stability, e.g., in a
duplex between a sense sequence of an iRNA agent and an off-target mRNA. In this case
one or more of the monomers, the position of the monomers in the iRNA agent, and an
off-target sequence, are selected such that the iRNA agent duplex has a comparatively higher
free energy of association while the duplex formed between a sense targeting sequence and
its off-target sequence has a relatively lower free energy of association (which, while not
wishing to be bound by mechanism or theory, is believed to reduce the level of off-target
silencing, and thereby contribute to efficacy, by promoting disassociation of the duplex
formed by the sense strand and the off-target sequence).

Thus, inherent in the structure of the iRNA agent is the property of having a first
stability for the intra-iRNA agent duplex and a second stability for a duplex formed between
a sequence from the iRNA agent and another RNA, e.g., a target mRNA. As discussed
above, this can be accomplished by judicious selection of one or more of the monomers at a
selected or constrained position, the selection of the position in the duplex to place the
selected or constrained position, and selection of the sequence of a target sequence (e.g., the
particular region of a target gene which is to be targeted). The iRNA agent sequences which
satisfy these requirements are sometimes referred to herein as constrained sequences. Exercise
of the constraint or selection parameters can be, e.g., by inspection, or by computer assisted
methods. Exercise of the parameters can result in selection of a target sequence and of
particular monomers to give a desired result in terms of the stability, or relative stability, of a
duplex.

Thus, in one aspect, the invention features an iRNA agent which includes: a first
sequence which targets a first target region and a second sequence which targets a second
target region. The first and second sequences have sufficient complementarity to each other
to hybridize, e.g., under physiological conditions, e.g., under physiological conditions but not
in contact with a helicase or other unwinding enzyme. In a duplex region of the iRNA agent,
at a selected or constrained position, the first target region has a first monomer, and the
second target region has a second monomer. The first and second monomers occupy
complementary or corresponding positions. One, and preferably both, monomers are selected
such that the stability which the pairing of the monomers contributes to a duplex between the
first and second sequence will differ from the stability of the pairing between the first or second
sequence and a target sequence.

Usually, the monomers will be selected (selection of the target sequence may be
required as well) such that they form a pairing in the iRNA agent duplex which has a lower
free energy of dissociation, and a lower Tm, than will be possessed by the pairing of the
monomer with its complementary monomer in a duplex between the iRNA agent sequence
and a target RNA.

The constraint placed upon the monomers can be applied at a selected site or at more 
than one selected site. By way of example, the constraint can be applied at more than 1, but 
less than 3, 4, 5, 6, or 7 sites in an iRNA agent duplex. 

A constrained or selected site can be present at a number of positions in the iRNA
agent duplex. E.g., a constrained or selected site can be present within 3, 4, 5, or 6 positions
from either end, 3' or 5', of a duplexed sequence. A constrained or selected site can be
present in the middle of the duplex region, e.g., it can be more than 3, 4, 5, or 6 positions
from the end of a duplexed region.
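
By way of non-limiting illustration, whether a selected site lies within a chosen number of positions from either end of a duplexed sequence, or in the middle of the duplex region, can be decided from its index. The Python sketch below assumes a 0-based index counted from the 5' end of one strand of the duplex region and counts the end position itself as position 1; the function name, the labels "terminal" and "middle", and the default window of 3 are hypothetical.

```python
# Illustrative sketch only; naming and counting convention are assumptions.
def site_location(position: int, duplex_len: int, window: int = 3) -> str:
    """Classify a 0-based duplex position as 'terminal' or 'middle'.

    A site is 'terminal' when it lies within `window` positions of the
    nearer end of the duplex, counting the end position itself as 1.
    """
    if not 0 <= position < duplex_len:
        raise ValueError("position lies outside the duplex region")
    distance_from_end = min(position, duplex_len - 1 - position) + 1
    return "terminal" if distance_from_end <= window else "middle"
```

For a 19-pair duplex and a window of 3, indices 0 to 2 and 16 to 18 are classified as terminal and all others as middle.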

The iRNA agent can be selected to target a broad spectrum of genes, including any of
the genes described herein.

In a preferred embodiment the iRNA agent has an architecture (architecture refers to
one or more of overall length, length of a duplex region, the presence, number, location, or
length of overhangs, single strand versus double strand form) described herein.

E.g., the iRNA agent can be less than 30 nucleotides in length, e.g., 21-23
nucleotides. Preferably, the iRNA is 21 nucleotides in length and there is a duplex region of
about 19 pairs. In one embodiment, the iRNA is 21 nucleotides in length, and the duplex
region of the iRNA is 19 nucleotides. In another embodiment, the iRNA is greater than 30
nucleotides in length.

In some embodiments the duplex region of the iRNA agent will have mismatches in
addition to the selected or constrained site or sites. Preferably it will have no more than 1, 2,
3, 4, or 5 bases which do not form canonical Watson-Crick pairs or which do not hybridize.
Overhangs are discussed in detail elsewhere herein but are preferably about 2 nucleotides in
length. The overhangs can be complementary to the gene sequences being targeted or can be
other sequence. TT is a preferred overhang sequence. The first and second iRNA agent
sequences can also be joined, e.g., by additional bases to form a hairpin, or by other non-base
linkers.

The monomers can be selected such that: first and second monomers are naturally
occurring ribonucleotides, or modified ribonucleotides having naturally occurring bases, and
when occupying complementary sites either do not pair and have no substantial level of
H-bonding, or form a non-canonical Watson-Crick pairing and form a non-canonical pattern of
H-bonding, which usually has a lower free energy of dissociation than seen in a canonical
Watson-Crick pairing, or otherwise pair to give a free energy of association which is less
than that of a preselected value or is less, e.g., than that of a canonical pairing. When one (or
both) of the iRNA agent sequences duplexes with a target, the first (or second) monomer
forms a canonical Watson-Crick pairing with the base in the complementary position on the
target, or forms a non-canonical Watson-Crick pairing having a higher free energy of
dissociation and a higher Tm than seen in the pairing in the iRNA agent. The classical
Watson-Crick pairings are as follows: A-T, G-C, and A-U. Non-canonical Watson-Crick
pairings are known in the art and can include U-U, G-G, G-A trans, G-A cis, and G-U.
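
By way of non-limiting illustration, the pairings listed above can be encoded in a lookup so that a candidate site is screened by base identity. The Python sketch below uses only the pairings named in this paragraph; because the cis and trans forms of G-A differ in geometry rather than base identity, both are represented by a single G-A entry. The function names, the label "unlisted" for pairs not named above, and the screening rule in `is_destabilizing_site` are hypothetical simplifications that compare pairing type rather than measured free energy or Tm.

```python
# Illustrative sketch only; labels and the screening rule are assumptions.
CANONICAL = {frozenset("AT"), frozenset("AU"), frozenset("GC")}
NON_CANONICAL = {frozenset("UU"), frozenset("GG"), frozenset("GA"), frozenset("GU")}


def classify_pair(x: str, y: str) -> str:
    """Classify two bases as 'canonical', 'non-canonical', or 'unlisted'."""
    pair = frozenset((x.upper(), y.upper()))
    if pair in CANONICAL:
        return "canonical"
    if pair in NON_CANONICAL:
        return "non-canonical"
    return "unlisted"


def is_destabilizing_site(first_base: str, second_base: str, target_base: str) -> bool:
    """True when the first monomer pairs canonically with its target base
    but not with the second monomer opposite it in the iRNA agent duplex."""
    return (
        classify_pair(first_base, second_base) != "canonical"
        and classify_pair(first_base, target_base) == "canonical"
    )
```

For example, an A opposite a G in the agent duplex, with a U at the corresponding target position, satisfies `is_destabilizing_site("A", "G", "U")`.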

The monomer in one or both of the sequences is selected such that it does not pair, or
forms a pair with its corresponding monomer in the other sequence which minimizes stability
(e.g., the H bonding formed between the monomer at the selected site in the one sequence
and its monomer at the corresponding site in the other sequence is less stable than the H
bonds formed by the monomer of one (or both) of the sequences with the respective target
sequence). The monomer in one or both strands is also chosen to promote stability in one or
both of the duplexes made by a strand and its target sequence. E.g., one or more of the
monomers and the target sequences are selected such that at the selected or constrained
position, there are no H bonds formed, or a non-canonical pairing is formed in the iRNA
agent duplex, or they otherwise pair to give a free energy of association which is
less than that of a preselected value or is less, e.g., than that of a canonical pairing, but when
one (or both) sequences form a duplex with the respective target, the pairing at the selected
or constrained site is a canonical Watson-Crick pairing.

The inclusion of such monomers will have one or more of the following effects: it
will destabilize the iRNA agent duplex, it will destabilize interactions between the sense
sequence and unintended target sequences, sometimes referred to as off-target sequences, and
duplex interactions between a sequence and the intended target will not be destabilized.

By way of example:

The monomer at the selected site in the first sequence includes an A (or a modified
base which pairs with T), and the monomer at the selected position in the second sequence
is chosen from a monomer which will not pair or which will form a non-canonical pairing,
e.g., G. These will be useful in applications wherein the target sequence for the first
sequence has a T at the selected position. In embodiments where both target duplexes are
stabilized it is useful wherein the target sequence for the second strand has a monomer which
will form a canonical Watson-Crick pairing with the monomer selected for the selected
position in the second strand.

The monomer at the selected site in the first sequence includes U (or a modified base
which pairs with A), and the monomer at the selected position in the second sequence is
chosen from a monomer which will not pair or which will form a non-canonical pairing, e.g.,
U or G. These will be useful in applications wherein the target sequence for the first
sequence has a T at the selected position. In embodiments where both target duplexes are
stabilized it is useful wherein the target sequence for the second strand has a monomer which
will form a canonical Watson-Crick pairing with the monomer selected for the selected
position in the second strand.

The monomer at the selected site in the first sequence includes a G (or a modified
base which pairs with C), and the monomer at the selected position in the second sequence
is chosen from a monomer which will not pair or which will form a non-canonical pairing,
e.g., G, A cis, A trans, or U. These will be useful in applications wherein the target sequence
for the first sequence has a T at the selected position. In embodiments where both target
duplexes are stabilized it is useful wherein the target sequence for the second strand has a
monomer which will form a canonical Watson-Crick pairing with the monomer selected for
the selected position in the second strand.

The monomer at the selected site in the first sequence includes a C (or a modified
base which pairs with G), and the monomer at the selected position in the second sequence
is chosen from a monomer which will not pair or which will form a non-canonical pairing. These
will be useful in applications wherein the target sequence for the first sequence has a T at the
selected position. In embodiments where both target duplexes are stabilized it is useful
wherein the target sequence for the second strand has a monomer which will form a
canonical Watson-Crick pairing with the monomer selected for the selected position in the
second strand.

In another embodiment a non-naturally occurring or modified monomer or monomers
are chosen such that when the non-naturally occurring or modified monomers occupy the
selected or constrained position in an iRNA agent they exhibit a first free
energy of dissociation and when one (or both) of them pairs with a naturally occurring
monomer, the pair exhibits a second free energy of dissociation, which is usually higher than
that of the pairing of the first and second monomers. E.g., when the first and second
monomers occupy complementary positions they either do not pair and have no substantial
level of H-bonding, or form a weaker bond than one of them would form with a naturally
occurring monomer, and reduce the stability of that duplex, but when the duplex dissociates
at least one of the strands will form a duplex with a target in which the selected monomer
will promote stability, e.g., the monomer will form a more stable pair with a naturally
occurring monomer in the target sequence than the pairing it formed in the iRNA agent.

An example of such a pairing is 2-amino A and either of a 2-thio pyrimidine analog
of U or T.

When placed in complementary positions of the iRNA agent these monomers will
pair very poorly and will minimize stability. However, a duplex formed between 2-amino
A and the U of a naturally occurring target, or a duplex between 2-thio U and the A of a
naturally occurring target, or between 2-thio T and the A of a naturally occurring target, will
have a relatively higher free energy of dissociation and be more stable. This is shown in FIG. 1.

The pair shown in FIG. 1 (the 2-amino A and the 2-s U and T) is exemplary. In
another embodiment, the monomer at the selected position in the sense strand can be a
universal pairing moiety. A universal pairing agent will form some level of H bonding with
more than one and preferably all other naturally occurring monomers. An example of a
universal pairing moiety is a monomer which includes 3-nitro pyrrole. (Examples of other
candidate universal base analogs can be found in the art, e.g., in Loakes, 2001, NAR 29:
2437-2447, hereby incorporated by reference. Examples can also be found in the section on
Universal Bases below.) In these cases the monomer at the corresponding position of the
anti-sense strand can be chosen for its ability to form a duplex with the target and can
include, e.g., A, U, G, or C.

In another aspect, the invention features an iRNA agent which includes: a sense
sequence, which preferably does not target a sequence in a subject, and an anti-sense
sequence, which targets a target gene in a subject. The sense and anti-sense sequences have
sufficient complementarity to each other to hybridize, e.g., under physiological
conditions, e.g., under physiological conditions but not in contact with a helicase or other
unwinding enzyme. In a duplex region of the iRNA agent, at a selected or constrained
position, the monomers are selected such that:

the monomer in the sense sequence is selected such that it does not pair, or forms a
pair with its corresponding monomer in the anti-sense strand which minimizes stability (e.g.,
the H bonding formed between the monomer at the selected site in the sense strand and its
monomer at the corresponding site in the anti-sense strand is less stable than the H bonds
formed by the monomer of the anti-sense sequence and its canonical Watson-Crick partner
or, if the monomer in the anti-sense strand includes a modified base, the natural analog of the
modified base and its canonical Watson-Crick partner);

the monomer in the corresponding position in the anti-sense strand is selected such
that it maximizes the stability of a duplex it forms with the target sequence, e.g., it forms a
canonical Watson-Crick pairing with the monomer in the corresponding position on the target
strand;

optionally, the monomer in the sense sequence is selected such that it does not pair,
or forms a pair with its corresponding monomer in the anti-sense strand which minimizes
stability with an off-target sequence.

The inclusion of such monomers will have one or more of the following effects: it
will destabilize the iRNA agent duplex, it will destabilize interactions between the sense
sequence and unintended target sequences, sometimes referred to as off-target sequences, and
duplex interactions between the anti-sense strand and the intended target will not be
destabilized.

The constraint placed upon the monomers can be applied at a selected site or at more
than one selected site. By way of example, the constraint can be applied at more than 1, but
less than 3, 4, 5, 6, or 7 sites in an iRNA agent duplex.

A constrained or selected site can be present at a number of positions in the iRNA
agent duplex. E.g., a constrained or selected site can be present within 3, 4, 5, or 6 positions
from either end, 3' or 5', of a duplexed sequence. A constrained or selected site can be
present in the middle of the duplex region, e.g., it can be more than 3, 4, 5, or 6 positions
from the end of a duplexed region.

The iRNA agent can be selected to target a broad spectrum of genes, including any of 
the genes described herein. 

In a preferred embodiment the iRNA agent has an architecture (architecture refers to 
one or more of overall length, length of a duplex region, the presence, number, location, or 
length of overhangs, single strand versus double strand form) described herein. 

E.g., the iRNA agent can be less than 30 nucleotides in length, e.g., 21-23 
nucleotides. Preferably, the iRNA is 21 nucleotides in length and there is a duplex region of 
about 19 pairs. In one embodiment, the iRNA is 21 nucleotides in length, and the duplex 
region of the iRNA is 19 nucleotides. In another embodiment, the iRNA is greater than 30 
nucleotides in length. 

In some embodiments the duplex region of the iRNA agent will have mismatches, in 
addition to the selected or constrained site or sites. Preferably it will have no more than 1, 2, 
3, 4, or 5 bases which do not form canonical Watson-Crick pairs or which do not hybridize. 
Overhangs are discussed in detail elsewhere herein but are preferably about 2 nucleotides in 
length. The overhangs can be complementary to the gene sequences being targeted or can be 
other sequence. TT is a preferred overhang sequence. The first and second iRNA agent 
sequences can also be joined, e.g., by additional bases to form a hairpin, or by other non-base 
linkers. 
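
As a minimal sketch of the architecture just described (an illustration only, not part of the disclosure), the following Python helper assumes two strands of equal length written 5' to 3', each ending in a 3' overhang of the same length, and counts the duplexed positions that are not canonical Watson-Crick pairs. The function names and the example sequence are hypothetical.

```python
# Canonical Watson-Crick pairs for RNA monomers (A-U and G-C).
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the RNA reverse complement of seq (written 5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def duplex_pairs(sense, antisense, overhang=2):
    """Pair up the duplexed monomers of two strands written 5'->3'.

    Assumption: both strands have the same length and end in a 3' overhang
    of `overhang` nucleotides that takes no part in the duplex.
    """
    if len(sense) != len(antisense):
        raise ValueError("strands must have equal length")
    if not 0 <= overhang < len(sense):
        raise ValueError("overhang must be shorter than the strand")
    n = len(sense) - overhang
    # The anti-sense strand runs antiparallel, so reverse its duplexed part.
    return list(zip(sense[:n], reversed(antisense[:n])))


def count_noncanonical(sense, antisense, overhang=2):
    """Count duplex positions that are not canonical Watson-Crick pairs."""
    pairs = duplex_pairs(sense, antisense, overhang)
    return sum(pair not in CANONICAL for pair in pairs)


core = "GCAUGCAUGCAUGCAUGCA"                 # hypothetical 19-nt duplex region
sense = core + "TT"                          # 21 nt with a TT 3' overhang
antisense = reverse_complement(core) + "TT"  # 21 nt with a TT 3' overhang

print(len(duplex_pairs(sense, antisense)))   # 19
print(count_noncanonical(sense, antisense))  # 0
```

A design could then be checked against the "no more than 1, 2, 3, 4, or 5" mismatch limits by comparing the returned count with the chosen threshold.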

One or more selection or constraint parameters can be exercised such that: monomers 
at the selected site in the sense and anti-sense sequences are both naturally occurring 
ribonucleotides, or modified ribonucleotides having naturally occurring bases, and when 
occupying complementary sites in the iRNA agent duplex either do not pair and have no 
substantial level of H-bonding, or form a non-canonical Watson-Crick pairing and thus form 
a non-canonical pattern of H bonding, which generally has a lower free energy of 
dissociation than seen in a Watson-Crick pairing, or otherwise pair to give a free energy of 
association which is less than that of a preselected value or is less, e.g., than that of a 
canonical pairing. When one of the iRNA agent sequences, usually the anti-sense sequence, 
forms a duplex with another sequence, generally a sequence in the subject, and generally a 
target sequence, the monomer forms a classic Watson-Crick pairing with the base in the 
complementary position on the target, or forms a non-canonical Watson-Crick pairing having 
a higher free energy of dissociation and a higher Tm than seen in the pairing in the iRNA 
agent. Optionally, when the other sequence of the iRNA agent, usually the sense sequence, 
forms a duplex with another sequence, generally a sequence in the subject, and generally an 
off-target sequence, the monomer fails to form a canonical Watson-Crick pairing with the 
base in the complementary position on the off-target sequence, e.g., it forms a 
non-canonical Watson-Crick pairing having a lower free energy of dissociation and a lower Tm. 
By way of example: 

the monomer at the selected site in the anti-sense strand includes an A (or a modified 
base which pairs with T), the corresponding monomer in the target is a T, and the sense 
strand is chosen from a base which will not pair or which will form a non-canonical pair, e.g., 
G; 

the monomer at the selected site in the anti-sense strand includes a U (or a modified 
base which pairs with A), the corresponding monomer in the target is an A, and the sense 
strand is chosen from a monomer which will not pair or which will form a non-canonical 
pairing, e.g., U or G; 

the monomer at the selected site in the anti-sense strand includes a C (or a modified 
base which pairs with G), the corresponding monomer in the target is a G, and the sense 
strand is chosen from a monomer which will not pair or which will form a non-canonical pairing, 
e.g., G, Acis, Atrans, or U; or 

the monomer at the selected site in the anti-sense strand includes a G (or a modified 
base which pairs with C), the corresponding monomer in the target is a C, and the sense 
strand is chosen from a monomer which will not pair or which will form a non-canonical 
pairing. 
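
The four cases above can be read as a lookup from the anti-sense monomer at the selected site to the target monomer it pairs with and to the sense-strand monomers given as examples. The following Python sketch simply transcribes those cases; the dictionary and function names are hypothetical, and the empty entry for G reflects that the text names no example monomer for that case.

```python
# Target monomer paired by each anti-sense base, as stated in the cases above.
TARGET_PARTNER = {"A": "T", "U": "A", "C": "G", "G": "C"}

# Example sense-strand monomers listed in the text for each anti-sense base.
SENSE_EXAMPLES = {
    "A": ["G"],
    "U": ["U", "G"],
    "C": ["G", "Acis", "Atrans", "U"],
    "G": [],  # no example monomer is named in the text for this case
}


def constrained_site(antisense_base):
    """Return (target partner, example sense monomers) for one selected site."""
    if antisense_base not in TARGET_PARTNER:
        raise ValueError(f"unsupported anti-sense base: {antisense_base!r}")
    # Copy the list so callers cannot modify the table by accident.
    return TARGET_PARTNER[antisense_base], list(SENSE_EXAMPLES[antisense_base])


for base in "AUCG":
    partner, examples = constrained_site(base)
    print(base, partner, examples)
```

Keeping the table as plain data makes it straightforward to extend with modified bases, which the text treats as equivalent to the natural base they pair like.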

In another embodiment a non-naturally occurring or modified monomer or monomers 
are chosen such that when they occupy complementary positions in an iRNA agent they exhibit 
a first free energy of dissociation and when one (or both) of them pairs with a naturally 
occurring monomer, the pair exhibits a second free energy of dissociation, which is usually 
higher than that of the pairing of the first and second monomers. E.g., when the first and 
second monomers occupy complementary positions they either do not pair and have no 
substantial level of H-bonding, or form a weaker bond than one of them would form with a 
naturally occurring monomer, and reduce the stability of that duplex, but when the duplex 
dissociates at least one of the strands will form a duplex with a target in which the selected 
monomer will promote stability, e.g., the monomer will form a more stable pair with a 
naturally occurring monomer in the target sequence than the pairing it formed in the iRNA 
agent. 

An example of such a pairing is 2-amino A and either of a 2-thio pyrimidine analog 
of U or T. As is discussed above, when placed in complementary positions of the iRNA 
agent these monomers will pair very poorly and will minimize stability. However, a duplex 
formed between 2-amino A and the U of a naturally occurring target, or a duplex formed 
between 2-thio U and the A of a naturally occurring target or 2-thio T and the A of a 
naturally occurring target, will have a relatively higher free energy of dissociation and be 
more stable. 

The monomer at the selected position in the sense strand can be a universal pairing 
moiety. A universal pairing agent will form some level of H bonding with more than one and 
preferably all other naturally occurring monomers. An example of a universal pairing 
moiety is a monomer which includes 3-nitro pyrrole. Examples of other candidate universal 
base analogs can be found in the art, e.g., in Loakes, 2001, NAR 29: 2437-2447, hereby 
incorporated by reference. In these cases the monomer at the corresponding position of the 
anti-sense strand can be chosen for its ability to form a duplex with the target and can 
include, e.g., A, U, G, or C. 

In another aspect, the invention features an iRNA agent which includes: 
a sense sequence, which preferably does not target a sequence in a subject, and an anti-sense 
sequence, which targets a plurality of target sequences in a subject, wherein the targets differ 
in sequence at only 1 or a small number, e.g., no more than 5, 4, 3 or 2 positions. The sense 
and anti-sense sequences have sufficient complementarity to each other to hybridize, e.g., 
under physiological conditions, e.g., under physiological conditions but not in contact with a 
helicase or other unwinding enzyme. The sequence of the anti-sense strand of the iRNA 
agent is selected such that at one, some, or all of the positions which correspond to positions 
that differ in sequence between the target sequences, the anti-sense strand will include a 
monomer which will form H-bonds with at least two different target sequences. In a 
preferred example the anti-sense sequence will include a universal or promiscuous monomer, 
e.g., a monomer which includes 5-nitro pyrrole, 2-amino A, 2-thio U or 2-thio T, or other 
universal base referred to herein. 

In a preferred embodiment the iRNA agent targets repeated sequences (which differ 
at only one or a small number of positions from each other) in a single gene, a plurality of 
genes, or a viral genome, e.g., the HCV genome. 
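
The selection of an anti-sense sequence for several closely related targets can be sketched as follows. This is an illustration under stated assumptions, not the method of the disclosure: the targets are assumed to be pre-aligned RNA strings of equal length, the placeholder "N" stands for whichever universal or promiscuous monomer is chosen, and the function names and example sequences are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
UNIVERSAL = "N"  # placeholder for a universal or promiscuous monomer


def variable_positions(targets):
    """Return the indices at which the aligned target sequences differ."""
    if len({len(t) for t in targets}) != 1:
        raise ValueError("targets must be non-empty and of equal length")
    return [i for i, column in enumerate(zip(*targets)) if len(set(column)) > 1]


def antisense_for_targets(targets, max_variable=5):
    """Build an anti-sense sequence (5'->3') covering all aligned targets.

    Positions shared by every target get the canonical complement;
    positions that differ between targets get the universal placeholder.
    """
    variable = set(variable_positions(targets))
    if len(variable) > max_variable:
        raise ValueError("targets differ at too many positions")
    reference = targets[0]
    complement = [
        UNIVERSAL if i in variable else COMPLEMENT[base]
        for i, base in enumerate(reference)
    ]
    # Reverse so that the result reads 5'->3' on the antiparallel strand.
    return "".join(reversed(complement))


targets = ["AUGGCUA", "AUGACUA"]       # hypothetical targets, one difference
print(variable_positions(targets))     # [3]
print(antisense_for_targets(targets))  # UAGNCAU
```

The `max_variable` default of 5 mirrors the "no more than 5, 4, 3 or 2 positions" wording above and can be tightened as needed.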

An embodiment is illustrated in FIGs. 2 and 3. 

In another aspect, the invention features determining, e.g., by measurement or 
calculation, the stability of a pairing between monomers at a selected or constrained position 
in the iRNA agent duplex, and preferably determining the stability for the corresponding 
pairing in a duplex between a sequence from the iRNA agent and another RNA, e.g., a target 
sequence. The determinations can be compared. An iRNA agent thus analyzed can be used 
in the development of a further modified iRNA agent or can be administered to a subject. 
This analysis can be performed successively to refine or design optimized iRNA agents. 

In another aspect, the invention features a kit which includes one or more of the 
following: an iRNA agent described herein, a sterile container in which the iRNA agent is 
disclosed, and instructions for use. 

In another aspect, the invention features an iRNA agent containing a constrained 
sequence made by a method described herein. The iRNA agent can target one or more of the 
genes referred to herein. 

iRNA agents having constrained or selected sites, e.g., as described herein, can be 
used in any way described herein. Accordingly, iRNA agents having constrained or 
selected sites, e.g., as described herein, can be used to silence a target, e.g., in any of the 
methods described herein and to target any of the genes described herein or to treat any of the 
disorders described herein. iRNA agents having constrained or selected sites, e.g., as 
described herein, can be incorporated into any of the formulations or preparations, e.g., 
pharmaceutical or sterile preparations described herein. iRNA agents having constrained or 
selected sites, e.g., as described herein, can be administered by any of the routes of 
administration described herein. 

The term "other than canonical Watson-Crick pairing" as used herein, refers to a 
pairing between a first monomer in a first sequence and a second monomer at the 
corresponding position in a second sequence of a duplex in which one or more of the 
following is true: (1) there is essentially no pairing between the two, e.g., there is no 
significant level of H bonding between the monomers or binding between the monomers 
does not contribute in any significant way to the stability of the duplex; (2) the monomers are 
a non-canonical pairing of monomers having naturally occurring bases, i.e., they are other 
than A-T, A-U, or G-C, and they form monomer-monomer H bonds, although generally the 
H bonding pattern formed is less strong than the bonds formed by a canonical pairing; or (3) 
at least one of the monomers includes a non-naturally occurring base and the H bonds 
formed between the monomers are, preferably, less strong than the bonds formed by 
a canonical pairing, namely one or more of A-T, A-U, G-C. 

The term "off-target" as used herein, refers to a sequence other than the sequence to 
be silenced. 

Universal Bases: "wild-cards"; shape-based complementarity 

(Bi-stranded, multisite replication of a base pair between difluorotoluene and adenine: confirmation by 
'inverse' sequencing. Liu, D.; Moran, S.; Kool, E. T. Chem. Biol., 1997, 4, 919-926) 

(Importance of terminal base pair hydrogen-bonding in 3'-end proofreading by the Klenow fragment 
of DNA polymerase I. Morales, J. C.; Kool, E. T. Biochemistry, 2000, 39, 2626-2632) 



(Selective and stable DNA base pairing without hydrogen bonds. Matray, T. J.; Kool, E. T. J. Am. 
Chem. Soc., 1998, 120, 6191-6192) 




(Difluorotoluene, a nonpolar isostere for thymine, codes specifically and efficiently for adenine in 
DNA replication. Moran, S.; Ren, R. X.-F.; Rumney IV, S.; Kool, E. T. J. Am. Chem. Soc., 1997, 119, 
2056-2057) 



WO 2004/080406    PCT/US2004/007070 
(Structure and base pairing properties of a replicable nonpolar isostere for deoxyadenosine. Guckian, 
K. M.; Morales, J. C.; Kool, E. T. J. Org. Chem., 1998, 63, 9652-9656) 

[Chemical structures: MICS, PIM, 6MICS] 

(Universal bases for hybridization, replication and chain termination. Berger, M.; Wu, Y.; Ogawa, 
K.; McMinn, D. L.; Schultz, P. G.; Romesberg, F. E. Nucleic Acids Res., 2000, 28, 2911-2914) 




(1. Efforts toward the expansion of the genetic alphabet: Information storage and replication with unnatural 
hydrophobic base pairs. Ogawa, A. K.; Wu, Y.; McMinn, D. L.; Liu, J.; Schultz, P. G.; Romesberg, F. E. J. 
Am. Chem. Soc., 2000, 122, 3274-3287. 2. Rational design of an unnatural base pair with increased kinetic 
selectivity. Ogawa, A. K.; Wu, Y.; Berger, M.; Schultz, P. G.; Romesberg, F. E. J. Am. Chem. Soc., 2000, 
122, 8803-8804) 

[Chemical structure: 7AI] 



(Efforts toward expansion of the genetic alphabet: replication of DNA with three base pairs. Tae, E. L.; 
Wu, Y.; Xia, G.; Schultz, P. G.; Romesberg, F. E. J. Am. Chem. Soc., 2001, 123, 7439-7440) 

(1. Efforts toward expansion of the genetic alphabet: Optimization of interbase hydrophobic 
interactions. Wu, Y.; Ogawa, A. K.; Berger, M.; McMinn, D. L.; Schultz, P. G.; Romesberg, F. E. J. Am. Chem. 
Soc., 2000, 122, 7621-7632. 2. Efforts toward expansion of genetic alphabet: DNA polymerase recognition of a 
highly stable, self-pairing hydrophobic base. McMinn, D. L.; Ogawa, A. K.; Wu, Y.; Liu, J.; Schultz, P. G.; 
Romesberg, F. E. J. Am. Chem. Soc., 1999, 121, 11585-11586) 

(A stable DNA duplex containing a non-hydrogen-bonding and non-shape complementary base 
couple: Interstrand stacking as the stability determining factor. Brotschi, C.; Haberli, A.; Leumann, C. J. Angew. 
Chem. Int. Ed., 2001, 40, 3012-3014) 

(2,2'-Bipyridine Ligandoside: A novel building block for modifying DNA with intra-duplex metal 
complexes. Weizman, H.; Tor, Y. J. Am. Chem. Soc., 2001, 123, 3375-3376) 

[Chemical structure: d2APm] 

(Minor groove hydration is critical to the stability of DNA duplexes. Lan, T.; McLaughlin, L. W. J. 
Am. Chem. Soc., 2000, 122, 6512-13) 

(1. Effect of the universal base 3-nitropyrrole on the selectivity of neighboring natural bases. Oliver, J. 
S.; Parker, K. A.; Suggs, J. W. Organic Lett., 2001, 3, 1977-1980. 2. Effect of the 1-(2'-deoxy-β-D- 
ribofuranosyl)-3-nitropyrrole residue on the stability of DNA duplexes and triplexes. Amosova, O.; George, J.; 
Fresco, J. R. Nucleic Acids Res., 1997, 25, 1930-1934. 3. Synthesis, structure and deoxyribonucleic acid 
sequencing with a universal nucleoside: 1-(2'-deoxy-β-D-ribofuranosyl)-3-nitropyrrole. Bergstrom, D. E.; 
Zhang, P.; Toma, P. H.; Andrews, P. C.; Nichols, R. J. Am. Chem. Soc., 1995, 117, 1201-1209) 



(Model studies directed toward a general triplex DNA recognition scheme: a novel DNA base that 
binds a CG base-pair in an organic solvent. Zimmerman, S. C.; Schmitt, P. J. Am. Chem. Soc., 1995, 117, 
10769-10770) 




(A universal, photocleavable DNA base: nitropiperonyl 2'-deoxyriboside. J. Org. Chem., 2001, 66, 
2067-2071) 

(a. Recognition of a single guanine bulge by 2-acylamino-1,8-naphthyridine. Nakatani, K.; Sando, S.; 
Saito, I. J. Am. Chem. Soc., 2000, 122, 2172-2177. b. Specific binding of 2-amino-1,8-naphthyridine into single 
guanine bulge as evidenced by photooxidation of GC doublet. Nakatani, K.; Sando, S.; Yoshida, K.; Saito, I. 
Bioorg. Med. Chem. Lett., 2001, 11, 335-337) 

Other universal bases can have the following formulas: 

[Chemical structure drawings not reproduced] 

wherein: 

Q is N or CR'**; 

Q' is N or CR''^; 

Q" is N or CR"'; 

Q'" is N or CR'*'; 

Q'^ is N or CR^; 

R** is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHR^b, or NR^bR^c, C1-C6 
alkyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, or when taken together with R'*' 
forms -OCH2O-; 

R"' is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHR^b, or NR^bR^c, C1-C6 
alkyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, or when taken together with R"" 
or R"** forms -OCH2O-; 

R"* is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHR^b, or NR^bR^c, C1-C6 
alkyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, or when taken together with R 
or R'" forms -OCH2O-; 

R*' is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHR^b, or NR^bR^c, C1-C6 
alkyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, or when taken together with R'** 
or R"* forms -OCH2O-; 

R"*^ is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHR^b, or NR^bR^c, C1-C6 
alkyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, or when taken together with R 
forms -OCH2O-; 

R'*' K'\ R", R". R'*, R", R^^ R^, R", R", R"", R''. R'', R'', R'*. R®. 
R'°, R", and R'^ are each independently selected from hydrogen, halo, hydroxy, nitro, 
protected hydroxy, NH2, NHR^b, or NR^bR^c, C1-C6 alkyl, C2-C6 alkynyl, C6-C10 aryl, C6-C10 
heteroaryl, C3-C8 heterocyclyl, NC(O)R", or NC(O)R"; 

R" is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHR^b, or NR^bR^c, C1-C6 
alkyl, C2-C6 alkynyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, NC(O)R", or 
NC(O)R°, or when taken together with R^^ forms a fused aromatic ring which may be 
optionally substituted; 

R^^ is hydrogen, halo, hydroxy, nitro, protected hydroxy, NH2, NHR^b, or NR^bR^c, C1-C6 
alkyl, C2-C6 alkynyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 heterocyclyl, NC(O)R*\ or 
NC(O)R°, or when taken together with R^^ forms a fused aromatic ring which may be 
optionally substituted; 

R^^ is halo, NH2, NHR^b, or NR^bR^c; 

R^b is C1-C6 alkyl or a nitrogen protecting group; 

R^c is C1-C6 alkyl; and 

R" is alkyl optionally substituted with halo, hydroxy, nitro, protected hydroxy, NH2, 
NHR^b, or NR^bR^c, C1-C6 alkyl, C2-C6 alkynyl, C6-C10 aryl, C6-C10 heteroaryl, C3-C8 
heterocyclyl, NC(O)R'^ or NC(O)R^. 

Examples of universal bases include: 

In one aspect, the invention features methods of producing iRNA agents, e.g., sRNA 
agents, e.g., an sRNA agent described herein, having the ability to mediate RNAi. These 
iRNA agents can be formulated for administration to a subject. 

In another aspect, the invention features a method of administering an iRNA agent, 
e.g., a double-stranded iRNA agent, or sRNA agent, to a subject (e.g., a human subject). The 
method includes administering a unit dose of the iRNA agent, e.g., a sRNA agent, e.g., a 
double stranded sRNA agent in which (a) the double-stranded part is 19-25 nucleotides (nt) long, 
preferably 21-23 nt, (b) the agent is complementary to a target RNA (e.g., an endogenous or 
pathogen target RNA), and, optionally, (c) the agent includes at least one 3' overhang 1-5 
nucleotides long. In one embodiment, the unit dose is less than 1.4 mg per kg of bodyweight, 
or less than 10, 5, 2, 1, 0.5, 0.1, 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001, 0.00005 or 0.00001 
mg per kg of bodyweight, and less than 200 nmole of RNA agent (e.g., about 4.4 x 10^16 copies) 
per kg of bodyweight, or less than 1500, 750, 300, 150, 75, 15, 7.5, 1.5, 0.75, 0.15, 0.075, 0.015, 
0.0075, 0.0015, 0.00075, 0.00015 nmole of RNA agent per kg of bodyweight. 

The defined amount can be an amount effective to treat or prevent a disease or 
disorder, e.g., a disease or disorder associated with the target RNA. The unit dose, for 
example, can be administered by injection (e.g., intravenous or intramuscular), an inhaled 
dose, or a topical application. Particularly preferred dosages are less than 2, 1, or 0.1 mg/kg 
of body weight. 

In a preferred embodiment, the unit dose is administered less frequently than once a 
day, e.g., less than every 2, 4, 8 or 30 days. In another embodiment, the unit dose is not 
administered with a frequency (e.g., not a regular frequency). For example, the unit dose 
may be administered a single time. 

In one embodiment, the effective dose is administered with other traditional 
therapeutic modalities. In one embodiment, the subject has a viral infection and the modality 
is an antiviral agent other than an iRNA agent, e.g., other than a double-stranded iRNA 
agent, or sRNA agent. In another embodiment, the subject has atherosclerosis and the 
effective dose of an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, is 
administered in combination with, e.g., after surgical intervention, e.g., angioplasty. 

In one embodiment, a subject is administered an initial dose and one or more 
maintenance doses of an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof). The maintenance dose or doses are generally lower than the initial dose, 
e.g., one-half less than the initial dose. A maintenance regimen can include treating the subject 
with a dose or doses ranging from 0.01 µg to 1.4 mg/kg of body weight per day, e.g., 10, 1, 
0.1, 0.01, 0.001, or 0.00001 mg per kg of bodyweight per day. The maintenance doses are 
preferably administered no more than once every 5, 10, or 30 days. 

In one embodiment, the iRNA agent pharmaceutical composition includes a plurality 
of iRNA agent species. In another embodiment, the iRNA agent species has sequences that 
are non-overlapping and non-adjacent to another species with respect to a naturally occurring 
target sequence. In another embodiment, the plurality of iRNA agent species is specific for 
different naturally occurring target genes. In another embodiment, the iRNA agent is allele 
specific. 

The inventors have discovered that iRNA agents described herein can be administered 
to mammals, particularly large mammals such as nonhuman primates or humans in a number 
of ways. 

In one embodiment, the administration of the iRNA agent, e.g., a double-stranded 
iRNA agent, or sRNA agent, composition is parenteral, e.g., intravenous (e.g., as a bolus or as 
a diffusible infusion), intradermal, intraperitoneal, intramuscular, intrathecal, intraventricular, 
intracranial, subcutaneous, transmucosal, buccal, sublingual, endoscopic, rectal, oral, vaginal, 
topical, pulmonary, intranasal, urethral or ocular. Administration can be provided by the 
subject or by another person, e.g., a health care provider. The medication can be provided in 
measured doses or in a dispenser that delivers a metered dose. Selected modes of delivery 
are discussed in more detail below. 

The invention provides methods, compositions, and kits for rectal administration or 
delivery of iRNA agents described herein. 

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
or precursor thereof) described herein, e.g., a therapeutically effective amount of an iRNA 
agent described herein, e.g., an iRNA agent having a double stranded region of less than 40, 
and preferably less than 30 nucleotides and having one or two 1-3 nucleotide single strand 3' 
overhangs can be administered rectally, e.g., introduced through the rectum into the lower or 
upper colon. This approach is particularly useful in the treatment of inflammatory disorders, 
disorders characterized by unwanted cell proliferation, e.g., polyps, or colon cancer. 

In some embodiments the medication is delivered to a site in the colon by introducing 
a dispensing device, e.g., a flexible, camera-guided device similar to that used for inspection 
of the colon or removal of polyps, which includes means for delivery of the medication. 

In one embodiment, the rectal administration of the iRNA agent is by means of an 
enema. The iRNA agent of the enema can be dissolved in a saline or buffered solution. 

In another embodiment, the rectal administration is by means of a suppository. The 
suppository can include other ingredients, e.g., an excipient, e.g., cocoa butter or 
hydroxypropylmethylcellulose. 

The invention also provides methods, compositions, and kits for oral delivery of 
iRNA agents described herein. 

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) described herein, e.g., a therapeutically effective amount of an iRNA agent 
described herein, e.g., an iRNA agent having a double stranded region of less than 40 and 
preferably less than 30 nucleotides and having one or two 1-3 nucleotide single strand 3' 
overhangs can be administered orally. 

Oral administration can be in the form of tablets, capsules, gel capsules, lozenges, 
troches or liquid syrups. In a preferred embodiment the composition is applied topically to a 
surface of the oral cavity. 

The invention also provides methods, compositions, and kits for buccal delivery of 
iRNA agents described herein. 

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) described herein, e.g., a therapeutically effective amount of iRNA agent 
having a double stranded region of less than 40 and preferably less than 30 nucleotides and 
having one or two 1-3 nucleotide single strand 3' overhangs can be administered to the buccal 
cavity. The medication can be sprayed into the buccal cavity or applied directly, e.g., in a 
liquid, solid, or gel form to a surface in the buccal cavity. This administration is particularly 
desirable for the treatment of inflammations of the buccal cavity, e.g., the gums or tongue. 
In one embodiment, the buccal administration is by spraying into the cavity, e.g., 
without inhalation, from a dispenser, e.g., a metered dose spray dispenser that dispenses the 
pharmaceutical composition and a propellant. 

The invention also provides methods, compositions, and kits for ocular delivery of 
iRNA agents described herein. 

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) described herein, e.g., a therapeutically effective amount of an iRNA agent 
described herein, e.g., a sRNA agent having a double stranded region of less than 40 and 
preferably less than 30 nucleotides and having one or two 1-3 nucleotide single strand 3' 
overhangs can be administered to ocular tissue. 

The medication can be applied to the surface of the eye or nearby tissue, e.g., the 
inside of the eyelid. It can be applied topically, e.g., by spraying, in drops, as an eyewash, or 
as an ointment. Administration can be provided by the subject or by another person, e.g., a 
health care provider. The medication can be provided in measured doses or in a dispenser 
that delivers a metered dose. 

The medication can also be administered to the interior of the eye, and can be 
introduced by a needle or other delivery device which can introduce it to a selected area or 
structure. 

Ocular treatment is particularly desirable for treating inflammation of the eye or 
nearby tissue. 

The invention also provides methods, compositions, and kits for delivery of iRNA 
agents described herein to or through the skin. 

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) described herein, e.g., a therapeutically effective amount of an iRNA agent 
described herein, e.g., a sRNA agent having a double stranded region of less than 40 and 
preferably less than 30 nucleotides and one or two 1-3 nucleotide single strand 3' overhangs 
can be administered directly to the skin. 

The medication can be applied topically or delivered in a layer of the skin, e.g., by the 
use of a microneedle or a battery of microneedles which penetrate into the skin, but 
preferably not into the underlying muscle tissue. 

In one embodiment, the administration of the iRNA agent composition is topical. In 
another embodiment, topical administration delivers the composition to the dermis or 
epidermis of a subject. In other embodiments the topical administration is in the form of 
transdermal patches, ointments, lotions, creams, gels, drops, suppositories, sprays, liquids or 
powders. A composition for topical administration can be formulated as a liposome, micelle, 
emulsion, or other lipophilic molecular assembly. 

In another embodiment, the transdermal administration is applied with at least one 
penetration enhancer. In other embodiments, the penetration can be enhanced with 
iontophoresis, phonophoresis, and sonophoresis. 

In another aspect, the invention provides methods, compositions, devices, and kits for 
pulmonary delivery of iRNA agents described herein. 

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) described herein, e.g., a therapeutically effective amount of iRNA agent, 
e.g., a sRNA agent having a double stranded region of less than 40, preferably less than 30 
nucleotides and having one or two 1-3 nucleotide single strand 3' overhangs can be 
administered to the pulmonary system. Pulmonary administration can be achieved by 
inhalation or by the introduction of a delivery device into the pulmonary system, e.g., by 
introducing a delivery device which can dispense the medication. 

The preferred method of pulmonary delivery is by inhalation. The medication can be 
provided in a dispenser which delivers the medication, e.g., wet or dry, in a form sufficiently 
small such that it can be inhaled. The device can deliver a metered dose of medication. The 
subject, or another person, can administer the medication. 

Pulmonary delivery is effective not only for disorders which directly affect 
pulmonary tissue, but also for disorders which affect other tissue. 

iRNA agents can be formulated as a liquid or nonliquid, e.g., a powder, crystal, or 
aerosol for pulmonary delivery. 

In another aspect, the invention provides methods, compositions, devices, and kits for 
nasal delivery of iRNA agents described herein. Accordingly, an iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can 
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, or precursor thereof) described herein, e.g., a 
therapeutically effective amount of iRNA agent, e.g., a sRNA agent having a double stranded 
region of less than 40 and preferably less than 30 nucleotides and having one or two 1-3 
nucleotide single strand 3' overhangs can be administered nasally. Nasal administration can 
be achieved by introduction of a delivery device into the nose, e.g., by introducing a delivery 
device which can dispense the medication. 

The preferred method of nasal delivery is by spray, aerosol, liquid, e.g., by drops, or 
by topical administration to a surface of the nasal cavity. The medication can be provided in 
a dispenser which delivers the medication, e.g., wet or dry, in a form sufficiently small 
such that it can be inhaled. The device can deliver a metered dose of medication. The 
subject, or another person, can administer the medication. 

Nasal delivery is effective not only for disorders which directly affect nasal tissue, but 
also for disorders which affect other tissue. 

iRNA agents can be formulated as a liquid or nonliquid, e.g., a powder or crystal, for 
nasal delivery. 

In another embodiment, the iRNA agent is packaged in a viral natural capsid or in a 
chemically or enzymatically produced artificial capsid or structure derived therefrom. 

In one aspect of the invention, a dosage of a pharmaceutical composition including 
an iRNA agent is administered in order to alleviate the symptoms of a disease state, e.g., 
cancer or a cardiovascular disease. 

In another aspect, gene expression in a subject is modulated by administering a 
pharmaceutical composition including an iRNA agent. In other embodiments, a subject is 
treated with the pharmaceutical composition by any of the methods mentioned above. In 
another embodiment, the subject has cancer. 

An iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a 
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA 
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) composition can be administered as a liposome. For example, the 
composition can be prepared by a method that includes: (1) contacting an iRNA agent with an 
amphipathic cationic lipid conjugate in the presence of a detergent; and (2) removing the 
detergent to form an iRNA agent and cationic lipid complex. In one embodiment, the 
detergent is cholate, deoxycholate, lauryl sarcosine, octanoyl sucrose, CHAPS (3-[(3- 
cholamidopropyl)-di-methylamine]-2-hydroxyl-1-propane), novel-β-D-glucopyranoside, 
lauryl dimethylamine oxide, or octylglucoside. The iRNA agent can be an sRNA agent. The 
method can include preparing a composition that includes a plurality of iRNA agents, e.g., 
specific for one or more different endogenous target RNAs. The method can include other 
features described herein. 

In another aspect, a subject is treated by administering a defined amount of an iRNA 
agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger 
iRNA agent which can be processed into a sRNA agent) composition that is in a powdered 
form. In one embodiment, the powder is a collection of microparticles. In one embodiment, 
the powder is a collection of crystalline particles. The composition can include a plurality of 
iRNA agents, e.g., specific for one or more different endogenous target RNAs. The method 
can include other features described herein. 

In one aspect, a subject is treated by administering a defined amount of an iRNA agent 
composition that is prepared by a method that includes spray-drying, i.e., atomizing a liquid 
solution, emulsion, or suspension, immediately exposing the droplets to a drying gas, and 
collecting the resulting porous powder particles. The composition can include a plurality of 
iRNA agents, e.g., specific for one or more different endogenous target RNAs. The method 
can include other features described herein. 

In one aspect, the iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof), is provided in a powdered, crystallized or other finely divided form, with 
or without a carrier, e.g., a micro- or nano-particle suitable for inhalation or other pulmonary 
delivery. In one embodiment, this includes providing an aerosol preparation, e.g., an 
aerosolized spray-dried composition. The aerosol composition can be provided in and/or 
dispensed by a metered dose delivery device. 

In another aspect, a subject is treated for a condition treatable by inhalation. In one 
embodiment, this method includes aerosolizing a spray-dried iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can 
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, or precursor thereof) composition and inhaling the 
aerosolized composition. The iRNA agent can be an sRNA. The composition can include a 
plurality of iRNA agents, e.g., specific for one or more different endogenous target RNAs. 
The method can include other features described herein. 

In another aspect, the invention features a method of treating a subject that includes: 
administering a composition including an effective/defined amount of an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent 
which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, or precursor thereof), wherein the composition 
is prepared by a method that includes spray-drying, lyophilization, vacuum drying, 
evaporation, fluid bed drying, or a combination of these techniques. 

In another aspect, the invention features a method that includes: evaluating a 
parameter related to the abundance of a transcript in a cell of a subject; comparing the 
evaluated parameter to a reference value; and if the evaluated parameter has a preselected 
relationship to the reference value (e.g., it is greater), administering an iRNA agent (or a 
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA 
which encodes an iRNA agent or precursor thereof) to the subject. In one embodiment, the 
iRNA agent includes a sequence that is complementary to the evaluated transcript. For 
example, the parameter can be a direct measure of transcript levels, a measure of a protein 
level, or a disease or disorder symptom or characterization (e.g., rate of cell proliferation and/or 
tumor mass, or viral load). 
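
The evaluate, compare, and administer logic of this method can be sketched in Python as follows. This is an illustration only and not part of the disclosed method: the names `TranscriptReading` and `should_administer`, the numeric values, and the default "greater than" relationship are assumptions chosen to mirror the example given in the text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranscriptReading:
    """Hypothetical container for an evaluated parameter related to
    transcript abundance (e.g., transcript level or protein level)."""
    value: float


def should_administer(
    reading: TranscriptReading,
    reference: float,
    relationship: Callable[[float, float], bool] = lambda v, r: v > r,
) -> bool:
    """Return True when the evaluated parameter has the preselected
    relationship to the reference value (default: it is greater)."""
    return relationship(reading.value, reference)


# Assumed example values: a reading above the reference triggers administration.
print(should_administer(TranscriptReading(120.0), reference=100.0))
```

A different preselected relationship (for example, "less than") can be supplied through the `relationship` argument without changing the rest of the sketch.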

In another aspect, the invention features a method that includes: administering a first 
amount of a composition that comprises an iRNA agent, e.g., a double-stranded iRNA agent, 
or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed into a 
sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, 
or sRNA agent, or precursor thereof) to a subject, wherein the iRNA agent includes a strand 
substantially complementary to a target nucleic acid; evaluating an activity associated with a 
protein encoded by the target nucleic acid; wherein the evaluation is used to determine if a 
second amount should be administered. In a preferred embodiment the method includes 
administering a second amount of the composition, wherein the timing of administration or 
dosage of the second amount is a function of the evaluating. The method can include other 
features described herein. 

In another aspect, the invention features a method of administering a source of a 
double-stranded iRNA agent (ds iRNA agent) to a subject. The method includes 
administering or implanting a source of a ds iRNA agent, e.g., a sRNA agent, that (a) 
includes a double-stranded region that is 19-25 nucleotides long, preferably 21-23 
nucleotides, (b) is complementary to a target RNA (e.g., an endogenous RNA or a pathogen 
RNA), and, optionally, (c) includes at least one 3' overhang 1-5 nt long. In one embodiment, 
the source releases ds iRNA agent over time, e.g., the source is a controlled or a slow release 
source, e.g., a microparticle that gradually releases the ds iRNA agent. In another 
embodiment, the source is a pump, e.g., a pump that includes a sensor or a pump that can 
release one or more unit doses. 
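
The length criteria recited above for a ds iRNA agent can be expressed as a small check. The following Python sketch is illustrative only; the function name `check_ds_irna_design` and its return format are assumptions, and criterion (b), complementarity to a target RNA, is deliberately not evaluated here.

```python
from typing import Dict, Tuple


def check_ds_irna_design(duplex_len: int,
                         overhangs_3p: Tuple[int, ...] = ()) -> Dict[str, bool]:
    """Compare a candidate agent against the recited length criteria.

    duplex_len   -- length of the double-stranded region, in nucleotides
    overhangs_3p -- lengths of any 3' overhangs, in nucleotides (optional)
    """
    return {
        # (a) double-stranded region 19-25 nucleotides long
        "duplex_in_range": 19 <= duplex_len <= 25,
        # (a) preferred range, 21-23 nucleotides
        "duplex_preferred": 21 <= duplex_len <= 23,
        # (c) optional: at least one 3' overhang 1-5 nt long
        "has_valid_3p_overhang": any(1 <= n <= 5 for n in overhangs_3p),
    }


# Assumed example: a 21 nt duplex with two 2 nt 3' overhangs.
print(check_ds_irna_design(21, (2, 2)))
```

Because the overhang is optional in the text, the sketch reports it as a separate flag rather than folding it into a single pass/fail result.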

In one aspect, the invention features a pharmaceutical composition that includes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) 
including a nucleotide sequence complementary to a target RNA, e.g., substantially and/or 
exactly complementary. The target RNA can be a transcript of an endogenous human gene. 
In one embodiment, the iRNA agent (a) is 19-25 nucleotides long, preferably 21-23 
nucleotides, (b) is complementary to an endogenous target RNA, and, optionally, (c) includes 
at least one 3' overhang 1-5 nt long. In one embodiment, the pharmaceutical composition can 
be an emulsion, microemulsion, cream, jelly, or liposome. 

In one example the pharmaceutical composition includes an iRNA agent mixed with a 
topical delivery agent. The topical delivery agent can be a plurality of microscopic vesicles. 
The microscopic vesicles can be liposomes. In a preferred embodiment the liposomes are 
cationic liposomes. 

In another aspect, the pharmaceutical composition includes an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent (e.g., a precursor, e.g., a larger iRNA agent 
which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, or precursor thereof) admixed with a topical 
penetration enhancer. In one embodiment, the topical penetration enhancer is a fatty acid. 
The fatty acid can be arachidonic acid, oleic acid, lauric acid, caprylic acid, capric acid, 
myristic acid, palmitic acid, stearic acid, linoleic acid, linolenic acid, dicaprate, tricaprate, 
monolein, dilaurin, glyceryl 1-monocaprate, 1-dodecylazacycloheptan-2-one, an 
acylcarnitine, an acylcholine, or a C1-10 alkyl ester, monoglyceride, diglyceride or 
pharmaceutically acceptable salt thereof. 

In another embodiment, the topical penetration enhancer is a bile salt. The bile salt 
can be cholic acid, dehydrocholic acid, deoxycholic acid, glucholic acid, glycholic acid, 
glycodeoxycholic acid, taurocholic acid, taurodeoxycholic acid, chenodeoxycholic acid, 
ursodeoxycholic acid, sodium tauro-24,25-dihydro-fusidate, sodium glycodihydrofusidate, 
polyoxyethylene-9-lauryl ether or a pharmaceutically acceptable salt thereof. 

In another embodiment, the penetration enhancer is a chelating agent. The chelating 
agent can be EDTA, citric acid, a salicylate, an N-acyl derivative of collagen, laureth-9, an 
N-amino acyl derivative of a beta-diketone or a mixture thereof. 

In another embodiment, the penetration enhancer is a surfactant, e.g., an ionic or 
nonionic surfactant. The surfactant can be sodium lauryl sulfate, polyoxyethylene-9-lauryl 
ether, polyoxyethylene-20-cetyl ether, a perfluorochemical emulsion or mixture thereof. 

In another embodiment, the penetration enhancer can be selected from a group 
consisting of unsaturated cyclic ureas, 1-alkyl-alkones, 1-alkenylazacyclo-alkanones, 
steroidal anti-inflammatory agents and mixtures thereof. In yet another embodiment the 
penetration enhancer can be a glycol, a pyrrol, an azone, or a terpene. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
form suitable for oral delivery. In one embodiment, oral delivery can be used to deliver an 
iRNA agent composition to a cell or a region of the gastro-intestinal tract, e.g., small 
intestine, colon (e.g., to treat a colon cancer), and so forth. The oral delivery form can be 
tablets, capsules or gel capsules. In one embodiment, the iRNA agent of the pharmaceutical 
composition modulates expression of a cellular adhesion protein, modulates a rate of cellular 
proliferation, or has biological activity against eukaryotic pathogens or retroviruses. In 
another embodiment, the pharmaceutical composition includes an enteric material that 
substantially prevents dissolution of the tablets, capsules or gel capsules in a mammalian 
stomach. In a preferred embodiment the enteric material is a coating. The coating can be 
acetate phthalate, propylene glycol, sorbitan monoleate, cellulose acetate trimellitate, 
hydroxy propyl methylcellulose phthalate or cellulose acetate phthalate. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes a penetration enhancer. The penetration enhancer can be a bile salt or a fatty acid. 
The bile salt can be ursodeoxycholic acid, chenodeoxycholic acid, and salts thereof. The 
fatty acid can be capric acid, lauric acid, and salts thereof. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes an excipient. In one example the excipient is polyethyleneglycol. In another 
example the excipient is precirol. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes a plasticizer. The plasticizer can be diethyl phthalate, triacetin, dibutyl sebacate, 
dibutyl phthalate or triethyl citrate. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent and a delivery vehicle. In one embodiment, the iRNA agent (a) is 19-25 
nucleotides long, preferably 21-23 nucleotides, (b) is complementary to an endogenous target 
RNA, and, optionally, (c) includes at least one 3' overhang 1-5 nucleotides long. 

In one embodiment, the delivery vehicle can deliver an iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can 
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, or precursor thereof) to a cell by a topical route of 
administration. The delivery vehicle can be microscopic vesicles. In one example the 
microscopic vesicles are liposomes. In a preferred embodiment the liposomes are cationic 
liposomes. In another example the microscopic vesicles are micelles. 

In one aspect, the invention features a method for making a pharmaceutical 
composition, the method including: (1) contacting an iRNA agent, e.g., a double-stranded 
iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be 
processed into a sRNA agent) with an amphipathic cationic lipid conjugate in the presence of 
a detergent; and (2) removing the detergent to form an iRNA agent and cationic lipid complex. 

In another aspect, the invention features a pharmaceutical composition produced by a 
method including: (1) contacting an iRNA agent, e.g., a double-stranded iRNA agent, or 
sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed into a 
sRNA agent) with an amphipathic cationic lipid conjugate in the presence of a detergent; and 
(2) removing the detergent to form an iRNA agent and cationic lipid complex. In one 
embodiment, the detergent is cholate, deoxycholate, lauryl sarcosine, octanoyl sucrose, 
CHAPS (3-[(3-cholamidopropyl)-di-methylamine]-2-hydroxyl-1-propane), novel-β-D- 
glucopyranoside, lauryl dimethylamine oxide, or octylglucoside. In another embodiment, the 
amphipathic cationic lipid conjugate is biodegradable. In yet another embodiment the 
pharmaceutical composition includes a targeting ligand. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in an 
injectable dosage form. In one embodiment, the injectable dosage form of the 
pharmaceutical composition includes sterile aqueous solutions or dispersions and sterile 
powders. In a preferred embodiment the sterile solution can include a diluent such as water, 
saline solution, fixed oils, polyethylene glycols, glycerin, or propylene glycol. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in 
oral dosage form. In one embodiment, the oral dosage form is selected from the group 
consisting of tablets, capsules and gel capsules. In another embodiment, the pharmaceutical 
composition includes an enteric material that substantially prevents dissolution of the tablets, 
capsules or gel capsules in a mammalian stomach. In a preferred embodiment the enteric 
material is a coating. The coating can be acetate phthalate, propylene glycol, sorbitan 
monoleate, cellulose acetate trimellitate, hydroxy propyl methyl cellulose phthalate or 
cellulose acetate phthalate. In one embodiment, the oral dosage form of the pharmaceutical 
composition includes a penetration enhancer, e.g., a penetration enhancer described herein. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes an excipient. In one example the excipient is polyethyleneglycol. In another 
example the excipient is precirol. 

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes a plasticizer. The plasticizer can be diethyl phthalate, triacetin, dibutyl sebacate, 
dibutyl phthalate or triethyl citrate. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
rectal dosage form. In one embodiment, the rectal dosage form is an enema. In another 
embodiment, the rectal dosage form is a suppository. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
vaginal dosage form. In one embodiment, the vaginal dosage form is a suppository. In 
another embodiment, the vaginal dosage form is a foam, cream, or gel. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
pulmonary or nasal dosage form. In one embodiment, the iRNA agent is incorporated into a 
particle, e.g., a macroparticle, e.g., a microsphere. The particle can be produced by spray 
drying, lyophilization, evaporation, fluid bed drying, vacuum drying, or a combination 
thereof. The microsphere can be formulated as a suspension, a powder, or an implantable 
solid. 

In one aspect, the invention features a spray-dried iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can 
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, or precursor thereof) composition suitable for 
inhalation by a subject, including: (a) a therapeutically effective amount of an iRNA agent 
suitable for treating a condition in the subject by inhalation; (b) a pharmaceutically 
acceptable excipient selected from the group consisting of carbohydrates and amino acids; 
and (c) optionally, a dispersibility-enhancing amount of a physiologically-acceptable, water- 
soluble polypeptide. 

In one embodiment, the excipient is a carbohydrate. The carbohydrate can be 
selected from the group consisting of monosaccharides, disaccharides, trisaccharides, and 
polysaccharides. In a preferred embodiment the carbohydrate is a monosaccharide selected 
from the group consisting of dextrose, galactose, mannitol, D-mannose, sorbitol, and sorbose. 
In another preferred embodiment the carbohydrate is a disaccharide selected from the group 
consisting of lactose, maltose, sucrose, and trehalose. 

In another embodiment, the excipient is an amino acid. In one embodiment, the 
amino acid is a hydrophobic amino acid. In a preferred embodiment the hydrophobic amino 
acid is selected from the group consisting of alanine, isoleucine, leucine, methionine, 
phenylalanine, proline, tryptophan, and valine. In yet another embodiment the amino acid is a 
polar amino acid. In a preferred embodiment the amino acid is selected from the group 
consisting of arginine, histidine, lysine, cysteine, glycine, glutamine, serine, threonine, 
tyrosine, aspartic acid and glutamic acid. 

In one embodiment, the dispersibility-enhancing polypeptide is selected from the 
group consisting of human serum albumin, α-lactalbumin, trypsinogen, and polyalanine. 

In one embodiment, the spray-dried iRNA agent composition includes particles 
having a mass median diameter (MMD) of less than 10 microns. In another embodiment, 
the spray-dried iRNA agent composition includes particles having a mass median diameter of 
less than 5 microns. In yet another embodiment the spray-dried iRNA agent composition 
includes particles having a mass median aerodynamic diameter (MMAD) of less than 5 
microns. 
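
For illustration, a mass median diameter can be estimated from a list of measured particle diameters and compared with the limits recited above. The Python sketch below is not taken from the disclosure: it assumes spherical particles of equal density (so that particle mass scales with the cube of the diameter), and the function names and sample values are hypothetical. It does not compute MMAD, which additionally depends on particle density and shape.

```python
from typing import Sequence


def mass_median_diameter(diameters_um: Sequence[float]) -> float:
    """Return the diameter at which half of the total particle mass is
    carried by particles of that size or smaller.

    Assumes equal-density spheres, so each particle's mass is
    proportional to diameter cubed.
    """
    if not diameters_um:
        raise ValueError("at least one particle diameter is required")
    ordered = sorted(diameters_um)
    masses = [d ** 3 for d in ordered]
    half_total = sum(masses) / 2.0
    running = 0.0
    for diameter, mass in zip(ordered, masses):
        running += mass
        if running >= half_total:
            return diameter
    return ordered[-1]  # defensive fallback for floating-point edge cases


def meets_mmd_limit(diameters_um: Sequence[float], limit_um: float = 10.0) -> bool:
    """True when the estimated MMD is less than the given limit (microns)."""
    return mass_median_diameter(diameters_um) < limit_um


# Assumed sample: one 4 micron particle outweighs four 1 micron particles.
print(mass_median_diameter([1.0, 1.0, 1.0, 1.0, 4.0]))
```

The sample shows why a mass median differs from a number median: the single large particle carries most of the mass, so it sets the MMD even though most particles are smaller.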

In certain other aspects, the invention provides kits that include a suitable container 
containing a pharmaceutical formulation of an iRNA agent, e.g., a double-stranded iRNA 
agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed 
into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA 
agent, or sRNA agent, or precursor thereof). In certain embodiments the individual 
components of the pharmaceutical formulation may be provided in one container. 
Alternatively, it may be desirable to provide the components of the pharmaceutical 
formulation separately in two or more containers, e.g., one container for an iRNA agent 
preparation, and at least another for a carrier compound. The kit may be packaged in a 
number of different configurations such as one or more containers in a single box. The 
different components can be combined, e.g., according to instructions provided with the kit. 
The components can be combined according to a method described herein, e.g., to prepare 
and administer a pharmaceutical composition. The kit can also include a delivery device. 

In another aspect, the invention features a device, e.g., an implantable device, wherein 
the device can dispense or administer a composition that includes an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent 
which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, or precursor thereof), e.g., an iRNA agent that 
silences an endogenous transcript. In one embodiment, the device is coated with the 
composition. In another embodiment the iRNA agent is disposed within the device. In 
another embodiment, the device includes a mechanism to dispense a unit dose of the 
composition. In other embodiments the device releases the composition continuously, e.g., 
by diffusion. Exemplary devices include stents, catheters, pumps, artificial organs or organ 
components (e.g., artificial heart, a heart valve, etc.), and sutures. 

As used herein, the term "crystalline" describes a solid having the structure or 
characteristics of a crystal, i.e., particles of three-dimensional structure in which the plane 
faces intersect at definite angles and in which there is a regular internal structure. The 
compositions of the invention may have different crystalline forms. Crystalline forms can be 
prepared by a variety of methods, including, for example, spray drying. 

As used herein, "specifically hybridizable" and "complementary" are terms which are 
used to indicate a sufficient degree of complementarity such that stable and specific binding 
occurs between a compound of the invention and a target RNA molecule. Specific binding 
requires a sufficient degree of complementarity to avoid non-specific binding of the 
oligomeric compound to non-target sequences under conditions in which specific binding is 
desired, i.e., under physiological conditions in the case of in vivo assays or therapeutic 
treatment, or in the case of in vitro assays, under conditions in which the assays are 
performed. The non-target sequences typically differ by at least 5 nucleotides. 

In one embodiment, an iRNA agent is "sufficiently complementary" to a target RNA, 
e.g., a target mRNA, such that the iRNA agent silences production of protein encoded by the 
target mRNA. In another embodiment, the iRNA agent is "exactly complementary" to a 
target RNA, e.g., the target RNA and the iRNA agent anneal, preferably to form a hybrid 
made exclusively of Watson-Crick basepairs in the region of exact complementarity. A 
"sufficiently complementary" iRNA agent can include an internal region (e.g., of at least 10 
nucleotides) that is exactly complementary to a target RNA. Moreover, in some 
embodiments, the iRNA agent specifically discriminates a single-nucleotide difference. In 
this case, the iRNA agent only mediates RNAi if exact complementarity is found in the region 
of (e.g., within 7 nucleotides of) the single-nucleotide difference. 
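
The notion of an internal region of exact complementarity can be illustrated computationally. The Python sketch below is an assumption-laden illustration, not the test used in the specification (which defines sufficient complementarity by silencing of the target): it counts Watson-Crick pairs only (A-U, G-C), treats both sequences as 5'-to-3' RNA strings, and uses hypothetical function names and example sequences.

```python
# Watson-Crick pairing only; G-U wobble pairs are deliberately excluded.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def longest_exact_complementary_run(agent_strand: str, target_rna: str) -> int:
    """Length of the longest contiguous stretch of the agent strand that is
    exactly complementary to some stretch of the target RNA."""
    probe = reverse_complement(agent_strand)
    target = target_rna.upper()
    best = 0
    # Longest-common-substring dynamic programme over probe and target.
    prev = [0] * (len(target) + 1)
    for i in range(1, len(probe) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if probe[i - 1] == target[j - 1]:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def has_exact_internal_region(agent_strand: str, target_rna: str,
                              min_run: int = 10) -> bool:
    """True if an exactly complementary region of at least min_run nt exists."""
    return longest_exact_complementary_run(agent_strand, target_rna) >= min_run


# Hypothetical sequences: the agent strand pairs with 11 nt of the target.
print(longest_exact_complementary_run("CGUAAUCGGAU", "GGAUCCGAUUACGCAA"))
```

The default `min_run` of 10 mirrors the "at least 10 nucleotides" example in the text; a check of this kind says nothing about whether silencing actually occurs.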

As used herein, the term "oligonucleotide" refers to a nucleic acid molecule (RNA or 
DNA) preferably of length less than 100, 200, 300, or 400 nucleotides. 

Unless otherwise defined, all technical and scientific terms used herein have the same 
meaning as commonly understood by one of ordinary skill in the art to which this invention 
pertains. The materials, methods, and examples are illustrative only and not intended to be 
limiting. Although methods and materials similar or equivalent to those described herein can 
be used in the practice or testing of the present invention, useful methods and materials are 
described below. Other features and advantages of the invention will be apparent from the 
accompanying drawings and description, and from the claims. The contents of all references, 
pending patent applications and published patents, cited throughout this application are 
hereby expressly incorporated by reference. In case of conflict, the present specification, 
including definitions, will control. 

BRIEF DESCRIPTION OF THE DRAWINGS 
FIG. 1 is a structural representation of base pairing in pseudocomplementary siRNA. 
FIG. 2 is a schematic representation of dual targeting siRNAs designed to target the 
HCV genome. 

FIG. 3 is a schematic representation of pseudocomplementary, bifunctional siRNAs 
designed to target the HCV genome. 

FIG. 4 is a general synthetic scheme for incorporation of RRMS monomers into an 
oligonucleotide. 

FIG. 5 is a table of representative RRMS carriers. Panel 1 shows pyrroline-based 
RRMSs; panel 2 shows 3-hydroxyproline-based RRMSs; panel 3 shows piperidine-based 
RRMSs; panel 4 shows morpholine and piperazine-based RRMSs; and panel 5 shows 
decalin-based RRMSs. R1 is succinate or phosphoramidate and R2 is H or a conjugate 
ligand. 

FIG. 6A is a graph depicting levels of luciferase mRNA in livers of CMV-Luc mice 
(Xanogen) following intravenous injection (iv) of buffer or siRNA into the tail vein. Each 
bar represents data from one mouse. RNA levels were quantified by QuantiGene Assay 
(Genospectra, Inc.; Fremont, CA). The Y axis represents chemiluminescence values in 
counts per second (CPS). 

FIG. 6B is a graph depicting levels of luciferase mRNA in livers of CMV-Luc mice 
(Xanogen). The values are averaged from the data depicted in FIG. XxxA. 

FIG. 7 is a graph depicting the pharmacokinetics of cholesterol-conjugated and 
unconjugated siRNA. The diamonds represent the amount of unconjugated P-labeled 
siRNA (ALN-3000) in mouse plasma over time; the squares represent the amount of 
cholesterol-conjugated ^^P-labeled siRNA (ALN-3001) in mouse plasma over time. "L1163" 
is equivalent to ALN-3000; "L1163Chol" is equivalent to ALN-3001. 

DETAILED DESCRIPTION 
Double-stranded RNA (dsRNA) directs the sequence-specific silencing of mRNA through 
a process known as RNA interference (RNAi). The process occurs in a wide variety of 
organisms, including mammals and other vertebrates. 

It has been demonstrated that 21-23 nt fragments of dsRNA are sequence-specific 
mediators of RNA silencing, e.g., by causing RNA degradation. While not wishing to be 
bound by theory, it may be that a molecular signal, which may be merely the specific length 
of the fragments, present in these 21-23 nt fragments recruits cellular factors that mediate 
RNAi. Described herein are methods for preparing and administering these 21-23 nt 
fragments, and other iRNA agents, and their use for specifically inactivating gene function. 
The use of iRNA agents (or recombinantly produced or chemically synthesized 
oligonucleotides of the same or similar nature) enables the targeting of specific mRNAs for 
silencing in mammalian cells. In addition, longer dsRNA agent fragments can also be used, 
e.g., as described below. 

Although, in mammalian cells, long dsRNAs can induce the interferon response 
which is frequently deleterious, sRNAs do not trigger the interferon response, at least not to 
an extent that is deleterious to the cell and host. In particular, the length of the iRNA agent 
strands in an sRNA agent can be less than 31, 30, 28, 25, or 23 nt, e.g., sufficiently short to 
avoid inducing a deleterious interferon response. Thus, the administration of a composition 
of sRNA agent (e.g., formulated as described herein) to a mammalian cell can be used to 
silence expression of a target gene while circumventing the interferon response. Further, a 
discrete species of iRNA agent can be used to selectively target one allele of a target 
gene, e.g., in a subject heterozygous for the allele. 

Moreover, in one embodiment, a mammalian cell is treated with an iRNA agent that 
disrupts a component of the interferon response, e.g., double stranded RNA (dsRNA)- 
activated protein kinase PKR. Such a cell can be treated with a second iRNA agent that 
includes a sequence complementary to a target RNA and that has a length that might 
otherwise trigger the interferon response. 

In a typical embodiment, the subject is a mammal such as a cow, horse, mouse, rat, 
dog, pig, goat, or a primate. The subject can be a dairy mammal (e.g., a cow, or goat) or 
other farmed animal (e.g., a chicken, turkey, sheep, pig, fish, shrimp). In a much preferred 
embodiment, the subject is a human, e.g., a normal individual or an individual that has, is 
diagnosed with, or is predicted to have a disease or disorder. 

Further, because iRNA agent mediated silencing persists for several days after 
administering the iRNA agent composition, in many instances, it is possible to administer the 
composition with a frequency of less than once per day, or, for some instances, only once for 
the entire therapeutic regimen. For example, treatment of some cancer cells may be 
mediated by a single bolus administration, whereas a chronic viral infection may require 
regular administration, e.g., once per week or once per month. 

A number of exemplary routes of delivery are described that can be used to 
administer an iRNA agent to a subject. In addition, the iRNA agent can be formulated 
according to an exemplary method described herein. 

iRNA AGENT STRUCTURE 

Described herein are isolated iRNA agents, e.g., RNA molecules, (double-stranded; 
single-stranded) that mediate RNAi. The iRNA agents preferably mediate RNAi with 
respect to an endogenous gene of a subject or to a gene of a pathogen. 

An "RNA agent" as used herein, is an unmodified RNA, modified RNA, or 
nucleoside surrogate, all of which are defined herein (see, e.g., the section below entitled 
RNA Agents). While numerous modified RNAs and nucleoside surrogates are described, 
preferred examples include those which have greater resistance to nuclease degradation than 
do unmodified RNAs. Preferred examples include those which have a 2' sugar modification, 
a modification in a single strand overhang, preferably a 3' single strand overhang, or, 
particularly if single stranded, a 5' modification which includes one or more phosphate 
groups or one or more analogs of a phosphate group. 

An "iRNA agent" as used herein, is an RNA agent which can, or which can be 
cleaved into an RNA agent which can, down regulate the expression of a target gene, 
preferably an endogenous or pathogen target RNA. While not wishing to be bound by 
theory, an iRNA agent may act by one or more of a number of mechanisms, including 
post-transcriptional cleavage of a target mRNA sometimes referred to in the art as RNAi, or 
pre-transcriptional or pre-translational mechanisms. An iRNA agent can include a single strand 
or can include more than one strand, e.g., it can be a double stranded iRNA agent. If the 
iRNA agent is a single strand it is particularly preferred that it include a 5' modification 
which includes one or more phosphate groups or one or more analogs of a phosphate group. 

The iRNA agent should include a region of sufficient homology to the target gene, 
and be of sufficient length in terms of nucleotides, such that the iRNA agent, or a fragment 
thereof, can mediate down regulation of the target gene. (For ease of exposition the term 
nucleotide or ribonucleotide is sometimes used herein in reference to one or more monomeric 
subunits of an RNA agent. It will be understood herein that the usage of the term 
"ribonucleotide" or "nucleotide", herein can, in the case of a modified RNA or nucleotide 
surrogate, also refer to a modified nucleotide, or surrogate replacement moiety at one or more 
positions.) Thus, the iRNA agent is or includes a region which is at least partially, and in 
some embodiments fully, complementary to the target RNA. It is not necessary that there be 
perfect complementarity between the iRNA agent and the target, but the correspondence 
must be sufficient to enable the iRNA agent, or a cleavage product thereof, to direct sequence 
specific silencing, e.g., by RNAi cleavage of the target RNA, e.g., mRNA. 

Complementarity, or degree of homology with the target strand, is most critical in the 
antisense strand. While perfect complementarity, particularly in the antisense strand, is often 
desired, some embodiments can include, particularly in the antisense strand, one or more but 
preferably 6, 5, 4, 3, 2, or fewer mismatches (with respect to the target RNA). The 
mismatches, particularly in the antisense strand, are most tolerated in the terminal regions 
and if present are preferably in a terminal region or regions, e.g., within 6, 5, 4, or 3 
nucleotides of the 5' and/or 3' terminus. The sense strand need only be sufficiently 
complementary with the antisense strand to maintain the overall double strand character of 
the molecule. 
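
The mismatch criteria above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names are hypothetical, both sequences are written 5' to 3' and are of equal length, and only Watson-Crick pairs (A-U, G-C) count as matches, so G-U wobble pairs are reported as mismatches.

```python
# Illustrative sketch only; names and the Watson-Crick-only rule are assumptions.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(rna.upper()))

def mismatch_positions(antisense, target_site):
    """1-based positions, counted from the antisense 5' end, that do not pair with the target."""
    if len(antisense) != len(target_site):
        raise ValueError("antisense and target site must be the same length")
    ideal = reverse_complement(target_site)
    return [i + 1 for i, (a, b) in enumerate(zip(antisense.upper(), ideal)) if a != b]

def mismatches_are_terminal(antisense, target_site, window=3):
    """True if every mismatch lies within `window` nucleotides of the 5' or 3' terminus."""
    n = len(antisense)
    return all(p <= window or p > n - window
               for p in mismatch_positions(antisense, target_site))
```

For a 21-nt antisense strand and `window=3`, a mismatch at position 2 counts as terminal, whereas one at position 11 does not. The window can be set to 6, 5, 4, or 3 to match the terminal regions named in the text.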

As discussed elsewhere herein, an iRNA agent will often be modified or include 
nucleoside surrogates in addition to the RRMS. Single stranded regions of an iRNA agent 
will often be modified or include nucleoside surrogates, e.g., the unpaired region or regions 
of a hairpin structure, e.g., a region which links two complementary regions, can have 
modifications or nucleoside surrogates. Modification to stabilize one or more 3'- or 5'-terminus 
of an iRNA agent, e.g., against exonucleases, or to favor the antisense sRNA agent 
to enter into RISC are also favored. Modifications can include C3 (or C6, C7, C12) amino 
linkers, thiol linkers, carboxyl linkers, non-nucleotidic spacers (C3, C6, C9, C12, abasic, 
triethylene glycol, hexaethylene glycol), special biotin or fluorescein reagents that come as 
phosphoramidites and that have another DMT-protected hydroxyl group, allowing multiple 
couplings during RNA synthesis. 

iRNA agents include: molecules that are long enough to trigger the interferon 
response (which can be cleaved by Dicer (Bernstein et al. 2001. Nature, 409:363-366) and 
enter a RISC (RNAi-induced silencing complex)); and, molecules which are sufficiently 
short that they do not trigger the interferon response (which molecules can also be cleaved by 
Dicer and/or enter a RISC), e.g., molecules which are of a size which allows entry into a 
RISC, e.g., molecules which resemble Dicer-cleavage products. Molecules that are short 
enough that they do not trigger an interferon response are termed sRNA agents or shorter 
iRNA agents herein. "sRNA agent or shorter iRNA agent" as used herein, refers to an iRNA 
agent, e.g., a double stranded RNA agent or single strand agent, that is sufficiently short that 
it does not induce a deleterious interferon response in a human cell, e.g., it has a duplexed 
region of less than 60 but preferably less than 50, 40, or 30 nucleotide pairs. The sRNA 
agent, or a cleavage product thereof, can down regulate a target gene, e.g., by inducing RNAi 
with respect to a target RNA, preferably an endogenous or pathogen target RNA. 

Each strand of an sRNA agent can be equal to or less than 30, 25, 24, 23, 22, 21, or 20 
nucleotides in length. The strand is preferably at least 19 nucleotides in length. For example, 
each strand can be between 21 and 25 nucleotides in length. Preferred sRNA agents have a 
duplex region of 17, 18, 19, 29, 21, 22, 23, 24, or 25 nucleotide pairs, and one or more 
overhangs, preferably one or two 3' overhangs, of 2-3 nucleotides. 
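
The relationship between strand length, 3' overhang, and duplex length can be made concrete with a small sketch. It rests on assumptions that are not part of the text: the function names are hypothetical, the two strands are fully paired apart from their 3' overhangs (no 5' overhangs, bulges, or internal loops), and the listed duplex sizes are read as the range 17 to 25 nucleotide pairs.

```python
# Illustrative sketch only; assumes strands pair fully except for their 3' overhangs.

def duplex_length(sense_len, antisense_len, sense_3p_overhang=2, antisense_3p_overhang=2):
    """Number of nucleotide pairs implied by the strand lengths and 3' overhangs."""
    paired_sense = sense_len - sense_3p_overhang
    paired_antisense = antisense_len - antisense_3p_overhang
    if paired_sense != paired_antisense:
        raise ValueError("strand lengths and overhangs imply different duplex lengths")
    return paired_sense

def in_preferred_srna_range(sense_len, antisense_len,
                            sense_3p_overhang=2, antisense_3p_overhang=2):
    """Check the preferred sRNA ranges: strands of 19-30 nt, duplex of 17-25 pairs,
    and 3' overhangs of 2-3 nt (the duplex range is an interpretation of the text)."""
    duplex = duplex_length(sense_len, antisense_len,
                           sense_3p_overhang, antisense_3p_overhang)
    strands_ok = all(19 <= n <= 30 for n in (sense_len, antisense_len))
    overhangs_ok = all(o in (2, 3) for o in (sense_3p_overhang, antisense_3p_overhang))
    return strands_ok and overhangs_ok and 17 <= duplex <= 25
```

Two 21-nt strands with 2-nt 3' overhangs give `duplex_length(21, 21) == 19`, which corresponds to the staggered arrangement of equal-length strands described in the text.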

In addition to homology to target RNA and the ability to down regulate a target gene, 
an iRNA agent will preferably have one or more of the following properties: 

(1) it will be of the Formula 1, 2, 3, or 4 set out in the RNA Agent section below; 

(2) if single stranded it will have a 5' modification which includes one or more 
phosphate groups or one or more analogs of a phosphate group; 

(3) it will, despite modifications, even to a very large number, or all of the 
nucleosides, have an antisense strand that can present bases (or modified bases) in the proper 
three dimensional framework so as to be able to form correct base pairing and form a duplex 
structure with a homologous target RNA which is sufficient to allow down regulation of the 
target, e.g., by cleavage of the target RNA; 

(4) it will, despite modifications, even to a very large number, or all of the 
nucleosides, still have "RNA-like" properties, i.e., it will possess the overall structural, 
chemical and physical properties of an RNA molecule, even though not exclusively, or even 
partly, of ribonucleotide-based content. For example, an iRNA agent can contain, e.g., a 
sense and/or an antisense strand in which all of the nucleotide sugars contain e.g., 2' fluoro in 
place of 2' hydroxyl. This deoxyribonucleotide-containing agent can still be expected to 
exhibit RNA-like properties. While not wishing to be bound by theory, the electronegative 
fluorine prefers an axial orientation when attached to the C2' position of ribose. This spatial 
preference of fluorine can, in turn, force the sugars to adopt a C3'-endo pucker. This is the 
same puckering mode as observed in RNA molecules and gives rise to the RNA-characteristic 
A-family-type helix. Further, since fluorine is a good hydrogen bond acceptor, 
it can participate in the same hydrogen bonding interactions with water molecules that are 
known to stabilize RNA structures. (Generally, it is preferred that a modified moiety at the 
2' sugar position will be able to enter into H-bonding which is more characteristic of the OH 
moiety of a ribonucleotide than the H moiety of a deoxyribonucleotide.) A preferred iRNA 
agent will: exhibit a C3'-endo pucker in all, or at least 50, 75, 80, 85, 90, or 95 % of its 
sugars; exhibit a C3'-endo pucker in a sufficient amount of its sugars that it can give rise to 
the RNA-characteristic A-family-type helix; will have no more than 20, 10, 5, 4, 3, 2, or 1 
sugar which is not a C3'-endo pucker structure. These limitations are particularly preferred 
in the antisense strand; 

(5) regardless of the nature of the modification, and even though the RNA agent 
can contain deoxynucleotides or modified deoxynucleotides, particularly in overhang or 
other single strand regions, it is preferred that DNA molecules, or any molecule in which 
more than 50, 60, or 70 % of the nucleotides in the molecule, or more than 50, 60, or 70 % of 
the nucleotides in a duplexed region are deoxyribonucleotides, or modified 
deoxyribonucleotides which are deoxy at the 2' position, are excluded from the definition of 
RNA agent. 

A "single strand iRNA agent" as used herein, is an iRNA agent which is made up of a 
single molecule. It may include a duplexed region, formed by intra-strand pairing, e.g., it 
may be, or include, a hairpin or pan-handle structure. Single strand iRNA agents are 
preferably antisense with regard to the target molecule. In preferred embodiments single 
strand iRNA agents are 5' phosphorylated or include a phosphoryl analog at the 5' 
terminus. 5'-phosphate modifications include those which are compatible with RISC 
mediated gene silencing. Suitable modifications include: 5'-monophosphate ((HO)2(O)P-O-5'); 
5'-diphosphate ((HO)2(O)P-O-P(HO)(O)-O-5'); 5'-triphosphate 
((HO)2(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'); 5'-guanosine cap (7-methylated or non-methylated) 
(7m-G-O-5'-(HO)(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'); 5'-adenosine cap (Appp), and any modified or 
unmodified nucleotide cap structure (N-O-5'-(HO)(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'); 
5'-monothiophosphate (phosphorothioate; (HO)2(S)P-O-5'); 5'-monodithiophosphate 
(phosphorodithioate; (HO)(HS)(S)P-O-5'), 5'-phosphorothiolate ((HO)2(O)P-S-5'); any 
additional combination of oxygen/sulfur replaced monophosphate, diphosphate and 
triphosphates (e.g. 5'-alpha-thiotriphosphate, 5'-gamma-thiotriphosphate, etc.), 
5'-phosphoramidates ((HO)2(O)P-NH-5', (HO)(NH2)(O)P-O-5'), 5'-alkylphosphonates 
(R=alkyl=methyl, ethyl, isopropyl, propyl, etc., e.g. RP(OH)(O)-O-5'-, (OH)2(O)P-5'-CH2-), 
5'-alkyletherphosphonates (R=alkylether=methoxymethyl (MeOCH2-), ethoxymethyl, etc., 
e.g. RP(OH)(O)-O-5'-). (These modifications can also be used with the antisense strand of a 
double stranded iRNA.) 

A single strand iRNA agent should be sufficiently long that it can enter the RISC and 
participate in RISC mediated cleavage of a target mRNA. A single strand iRNA agent is at 
least 14, and more preferably at least 15, 20, 25, 29, 35, 40, or 50 nucleotides in length. It is 
preferably less than 200, 100, or 60 nucleotides in length. 

Hairpin iRNA agents will have a duplex region equal to or at least 17, 18, 19, 29, 21, 
22, 23, 24, or 25 nucleotide pairs. The duplex region will preferably be equal to or less than 
200, 100, or 50, in length. Preferred ranges for the duplex region are 15-30, 17 to 23, 19 to 
23, and 19 to 21 nucleotide pairs in length. The hairpin will preferably have a single strand 
overhang or terminal unpaired region, preferably the 3', and preferably of the antisense side 
of the hairpin. Preferred overhangs are 2-3 nucleotides in length. 
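
As a rough illustration of how the duplex region of a hairpin relates to its sequence, the sketch below counts the consecutive base pairs formed when the two arms of a single strand fold back on each other. The function name, the default 2-nt 3' overhang, the minimum loop size, and the restriction to Watson-Crick pairs are all assumptions made for the example; it is not a structure-prediction method.

```python
# Illustrative sketch only. Watson-Crick pairs only; G-U wobble is not counted (an assumption).
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def hairpin_stem_length(seq, overhang_3p=2, min_loop=3):
    """Count consecutive pairs closing inward from the two ends of a hairpin.

    seq is written 5'->3'. The last `overhang_3p` nucleotides are treated as an
    unpaired 3' overhang. Counting stops at the first non-pair, or when fewer than
    `min_loop` unpaired nucleotides would remain between the two arms.
    """
    seq = seq.upper()
    body = seq[:len(seq) - overhang_3p]
    i, j = 0, len(body) - 1
    pairs = 0
    while j - i - 1 >= min_loop and PAIR.get(body[i]) == body[j]:
        pairs += 1
        i += 1
        j -= 1
    return pairs

stem = "GCAUGCAUGCAUGCAUGCA"                        # 19-nt arm, illustrative sequence
other_arm = "".join(PAIR[b] for b in reversed(stem))  # its reverse complement
hairpin = stem + "UUCG" + other_arm + "UU"            # 4-nt loop, 2-nt 3' overhang
print(hairpin_stem_length(hairpin))                   # 19
```

The example hairpin has a 19-pair duplex region and a 2-nt 3' overhang, which falls within the preferred ranges given above.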

A "double stranded (ds) iRNA agent" as used herein, is an iRNA agent which 
includes more than one, and preferably two, strands in which interchain hybridization can 
form a region of duplex structure. 

The antisense strand of a double stranded iRNA agent should be equal to or at least 
14, 15, 16, 17, 18, 19, 25, 29, 40, or 60 nucleotides in length. It should be equal to or less 
than 200, 100, or 50 nucleotides in length. Preferred ranges are 17 to 25, 19 to 23, and 19 
to 21 nucleotides in length. 

The sense strand of a double stranded iRNA agent should be equal to or at least 14, 
15, 16, 17, 18, 19, 25, 29, 40, or 60 nucleotides in length. It should be equal to or less than 
200, 100, or 50 nucleotides in length. Preferred ranges are 17 to 25, 19 to 23, and 19 to 21 
nucleotides in length. 

The double strand portion of a double stranded iRNA agent should be equal to or at 
least 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 29, 40, or 60 nucleotide pairs in length. It 
should be equal to or less than 200, 100, or 50 nucleotide pairs in length. Preferred ranges 
are 15-30, 17 to 23, 19 to 23, and 19 to 21 nucleotide pairs in length. 

In many embodiments, the ds iRNA agent is sufficiently large that it can be cleaved 
by an endogenous molecule, e.g., by Dicer, to produce smaller ds iRNA agents, e.g., sRNA 
agents. 

It may be desirable to modify one or both of the antisense and sense strands of a 
double strand iRNA agent. In some cases they will have the same modification or the same 
class of modification but in other cases the sense and antisense strand will have different 
modifications, e.g., in some cases it is desirable to modify only the sense strand. It may be 
desirable to modify only the sense strand, e.g., to inactivate it, e.g., the sense strand can be 
modified in order to inactivate the sense strand and prevent formation of an active 
sRNA/protein or RISC. This can be accomplished by a modification which prevents 
5'-phosphorylation of the sense strand, e.g., by modification with a 5'-O-methyl ribonucleotide 
(see Nykanen et al., (2001) ATP requirements and small interfering RNA structure in the 
RNA interference pathway. Cell 107, 309-321.) Other modifications which prevent 
phosphorylation can also be used, e.g., simply substituting the 5'-OH by H rather than O-Me. 
Alternatively, a large bulky group may be added to the 5'-phosphate turning it into a 
phosphodiester linkage, though this may be less desirable as phosphodiesterases can cleave 
such a linkage and release a functional sRNA 5'-end. Antisense strand modifications include 
5' phosphorylation as well as any of the other 5' modifications discussed herein, particularly 
the 5' modifications discussed above in the section on single stranded iRNA molecules. 

It is preferred that the sense and antisense strands be chosen such that the ds iRNA 
agent includes a single strand or unpaired region at one or both ends of the molecule. Thus, a 
ds iRNA agent contains sense and antisense strands, preferably paired to contain an 
overhang, e.g., one or two 5' or 3' overhangs but preferably a 3' overhang of 2-3 
nucleotides. Most embodiments will have a 3' overhang. Preferred sRNA agents will have 
single-stranded overhangs, preferably 3' overhangs, of 1 or preferably 2 or 3 nucleotides in 
length at each end. The overhangs can be the result of one strand being longer than the other, 
or the result of two strands of the same length being staggered. 5' ends are preferably 
phosphorylated. 

Preferred lengths for the duplexed region are between 15 and 30, most preferably 18, 
19, 20, 21, 22, and 23 nucleotides in length, e.g., in the sRNA agent range discussed above. 
sRNA agents can resemble in length and structure the natural Dicer processed products from 
long dsRNAs. Embodiments in which the two strands of the sRNA agent are linked, e.g., 
covalently linked are also included. Hairpin, or other single strand structures which provide 
the required double stranded region, and preferably a 3' overhang are also within the 
invention. 

The isolated iRNA agents described herein, including ds iRNA agents and sRNA 
agents can mediate silencing of a target RNA, e.g., mRNA, e.g., a transcript of a gene that 
encodes a protein. For convenience, such mRNA is also referred to herein as mRNA to be 
silenced. Such a gene is also referred to as a target gene. In general, the RNA to be silenced 
is an endogenous gene or a pathogen gene. In addition, RNAs other than mRNA, e.g., 
tRNAs, and viral RNAs, can also be targeted. 

As used herein, the phrase "mediates RNAi" refers to the ability to silence, in a 
sequence specific manner, a target RNA. While not wishing to be bound by theory, it is 
believed that silencing uses the RNAi machinery or process and a guide RNA, e.g., an sRNA 
agent of 21 to 23 nucleotides. 

As used herein, "specifically hybridizable" and "complementary" are terms which are 
used to indicate a sufficient degree of complementarity such that stable and specific binding 
occurs between a compound of the invention and a target RNA molecule. Specific binding 
requires a sufficient degree of complementarity to avoid non-specific binding of the 
oligomeric compound to non-target sequences under conditions in which specific binding is 
desired, i.e., under physiological conditions in the case of in vivo assays or therapeutic 
treatment, or in the case of in vitro assays, under conditions in which the assays are 
performed. The non-target sequences typically differ by at least 5 nucleotides. 

In one embodiment, an iRNA agent is "sufficiently complementary" to a target RNA, 
e.g., a target mRNA, such that the iRNA agent silences production of protein encoded by the 
target mRNA. In another embodiment, the iRNA agent is "exactly complementary" 
(excluding the RRMS containing subunit(s)) to a target RNA, e.g., the target RNA and the 
iRNA agent anneal, preferably to form a hybrid made exclusively of Watson-Crick basepairs 
in the region of exact complementarity. A "sufficiently complementary" target RNA can 
include an internal region (e.g., of at least 10 nucleotides) that is exactly complementary to a 
target RNA. Moreover, in some embodiments, the iRNA agent specifically discriminates a 
single-nucleotide difference. In this case, the iRNA agent only mediates RNAi if exact 
complementarity is found in the region (e.g., within 7 nucleotides of) the single-nucleotide 
difference. 
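
The single-nucleotide discrimination criterion can be sketched as a check that the agent is exactly complementary to one allele, and not to the other, within a window around the variant position. This is only an illustration: the function names, the 0-based `snp_index` convention, and the Watson-Crick-only pairing rule are assumptions, and sequence complementarity alone does not establish silencing activity.

```python
# Illustrative sketch only; names, indexing, and pairing rule are assumptions.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(rna.upper()))

def window_is_exact(agent, site, snp_index, flank=7):
    """True if `agent` is exactly complementary to `site` within `flank`
    nucleotides on either side of site[snp_index] (0-based, site written 5'->3')."""
    n = len(site)
    if len(agent) != n:
        raise ValueError("agent and target site must be the same length")
    ideal = reverse_complement(site)
    centre = n - 1 - snp_index          # agent position that pairs with the variant
    lo = max(0, centre - flank)
    hi = min(n, centre + flank + 1)
    return agent.upper()[lo:hi] == ideal[lo:hi]

def discriminates(agent, targeted_allele, other_allele, snp_index, flank=7):
    """True if the window is exact for the targeted allele but not for the other."""
    return (window_is_exact(agent, targeted_allele, snp_index, flank)
            and not window_is_exact(agent, other_allele, snp_index, flank))
```

For example, an agent built as the reverse complement of one allele satisfies `discriminates` against a second allele that differs at `snp_index`, which corresponds to the heterozygous-subject case mentioned earlier in the text.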

As used herein, the term "oligonucleotide" refers to a nucleic acid molecule (RNA or 
DNA) preferably of length less than 100, 200, 300, or 400 nucleotides. 

RNA agents discussed herein include otherwise unmodified RNA as well as RNA 
which has been modified, e.g., to improve efficacy, and polymers of nucleoside surrogates. 
Unmodified RNA refers to a molecule in which the components of the nucleic acid, namely 
sugars, bases, and phosphate moieties, are the same or essentially the same as that which 
occur in nature, preferably as occur naturally in the human body. The art has referred to rare 
or unusual, but naturally occurring, RNAs as modified RNAs, see, e.g., Limbach et al., 
(1994) Summary: the modified nucleosides of RNA, Nucleic Acids Res. 22: 2183-2196. 
Such rare or unusual RNAs, often termed modified RNAs (apparently because they are 
typically the result of a post-transcriptional modification) are within the term unmodified 
RNA, as used herein. Modified RNA as used herein refers to a molecule in which one or 
more of the components of the nucleic acid, namely sugars, bases, and phosphate moieties, 
are different from that which occur in nature, preferably different from that which occurs in 
the human body. While they are referred to as modified "RNAs," they will of course, 
because of the modification, include molecules which are not RNAs. Nucleoside surrogates 
are molecules in which the ribophosphate backbone is replaced with a non-ribophosphate 
construct that allows the bases to be presented in the correct spatial relationship such that 
hybridization is substantially similar to what is seen with a ribophosphate backbone, e.g., 
non-charged mimics of the ribophosphate backbone. Examples of all of the above are 
discussed herein. 

Much of the discussion below refers to single strand molecules. In many 
embodiments of the invention a double stranded iRNA agent, e.g., a partially double stranded 
iRNA agent, is required or preferred. Thus, it is understood that double stranded 
structures (e.g. where two separate molecules are contacted to form the double stranded 
region or where the double stranded region is formed by intramolecular pairing (e.g., a 
hairpin structure)) made of the single stranded structures described below are within the 
invention. Preferred lengths are described elsewhere herein. 

As nucleic acids are polymers of subunits or monomers, many of the modifications 
described below occur at a position which is repeated within a nucleic acid, e.g., a 
modification of a base, or a phosphate moiety, or a non-linking O of a phosphate moiety. 
In some cases the modification will occur at all of the subject positions in the nucleic acid but 
in many, and in fact in most cases it will not. By way of example, a modification may only 
occur at a 3' or 5' terminal position, may only occur in a terminal region, e.g. at a position 
on a terminal nucleotide or in the last 2, 3, 4, 5, or 10 nucleotides of a strand. A modification 
may occur in a double strand region, a single strand region, or in both. A modification may 
occur only in the double strand region of an RNA or may only occur in a single strand region 
of an RNA. E.g., a phosphorothioate modification at a non-linking O position may only 
occur at one or both termini, may only occur in a terminal region, e.g., at a position on a 
terminal nucleotide or in the last 2, 3, 4, 5, or 10 nucleotides of a strand, or may occur in 
double strand and single strand regions, particularly at termini. The 5' end or ends can be 
phosphorylated. 

In some embodiments it is particularly preferred, e.g., to enhance stability, to include 
particular bases in overhangs, or to include modified nucleotides or nucleotide surrogates, in 
single strand overhangs, e.g., in a 5' or 3' overhang, or in both. E.g., it can be desirable to 
include purine nucleotides in overhangs. In some embodiments all or some of the bases in a 
3' or 5' overhang will be modified, e.g., with a modification described herein. Modifications 
can include, e.g., the use of modifications at the 2' OH group of the ribose sugar, e.g., the use 
of deoxyribonucleotides, e.g., deoxythymidine, instead of ribonucleotides, and modifications 
in the phosphate group, e.g., phosphothioate modifications. Overhangs need not be 
homologous with the target sequence. 

Modifications and nucleotide surrogates are discussed below. 




The scaffold presented above in Formula 1 represents a portion of a ribonucleic acid. 
The basic components are the ribose sugar, the base, the terminal phosphates, and phosphate 
internucleotide linkers. Where the bases are naturally occurring bases, e.g., adenine, uracil, 
guanine or cytosine, the sugars are the unmodified 2' hydroxyl ribose sugar (as depicted) and 
W, X, Y, and Z are all O, Formula 1 represents a naturally occurring unmodified 
oligoribonucleotide. 

Unmodified oligoribonucleotides may be less than optimal in some applications, e.g., 
unmodified oligoribonucleotides can be prone to degradation by e.g., cellular nucleases. 
Nucleases can hydrolyze nucleic acid phosphodiester bonds. However, chemical 
modifications to one or more of the above RNA components can confer improved properties, 
and, e.g., can render oligoribonucleotides more stable to nucleases. Unmodified 
oligoribonucleotides may also be less than optimal in terms of offering tethering points for 
attaching ligands or other moieties to an iRNA agent. 

Modified nucleic acids and nucleotide surrogates can include one or more of: 

(i) alteration, e.g., replacement, of one or both of the non-linking (X and Y) 
phosphate oxygens and/or of one or more of the linking (W and Z) phosphate oxygens 
(When the phosphate is in the terminal position, one of the positions W or Z will not link the 
phosphate to an additional element in a naturally occurring ribonucleic acid. However, for 
simplicity of terminology, except where otherwise noted, the W position at the 5' end of a 
nucleic acid and the terminal Z position at the 3' end of a nucleic acid, are within the term 
"linking phosphate oxygens" as used herein.); 

(ii) alteration, e.g., replacement, of a constituent of the ribose sugar, e.g., of the 2' 
hydroxyl on the ribose sugar, or wholesale replacement of the ribose sugar with a structure 
other than ribose, e.g., as described herein; 

(iii) wholesale replacement of the phosphate moiety (bracket I) with "dephospho" 
linkers; 

(iv) modification or replacement of a naturally occurring base; 

(v) replacement or modification of the ribose-phosphate backbone (bracket II); 

(vi) modification of the 3' end or 5' end of the RNA, e.g., removal, modification or 
replacement of a terminal phosphate group or conjugation of a moiety, e.g. a fluorescently 
labeled moiety, to either the 3' or 5' end of RNA. 

The terms replacement, modification, alteration, and the like, as used in this context, 
do not imply any process limitation, e.g., modification does not mean that one must start with 
a reference or naturally occurring ribonucleic acid and modify it to produce a modified 
ribonucleic acid but rather modified simply indicates a difference from a naturally occurring 
molecule. 

It is understood that the actual electronic structure of some chemical entities cannot 
be adequately represented by only one canonical form (i.e. Lewis structure). While not 
wishing to be bound by theory, the actual structure can instead be some hybrid or weighted 
average of two or more canonical forms, known collectively as resonance forms or 
structures. Resonance structures are not discrete chemical entities and exist only on paper. 
They differ from one another only in the placement or "localization" of the bonding and 
nonbonding electrons for a particular chemical entity. It can be possible for one resonance 
structure to contribute to a greater extent to the hybrid than the others. Thus, the written and 
graphical descriptions of the embodiments of the present invention are made in terms of what 
the art recognizes as the predominant resonance form for a particular species. For example, 
any phosphoroamidate (replacement of a nonlinking oxygen with nitrogen) would be 
represented by X = O and Y = N in the above figure. 

Specific modifications are discussed in more detail below. 

The Phosphate Group 

The phosphate group is a negatively charged species. The charge is distributed 
equally over the two non-linking oxygen atoms (i.e., X and Y in Formula 1 above). However, 
the phosphate group can be modified by replacing one of the oxygens with a different 
substituent. One result of this modification to RNA phosphate backbones can be increased 
resistance of the oligoribonucleotide to nucleolytic breakdown. Thus while not wishing to be 
bound by theory, it can be desirable in some embodiments to introduce alterations which 
result in either an uncharged linker or a charged linker with unsymmetrical charge 
distribution. 

Examples of modified phosphate groups include phosphorothioate, 
phosphoroselenates, borano phosphates, borano phosphate esters, hydrogen phosphonates, 
phosphoroamidates, alkyl or aryl phosphonates and phosphotriesters. Phosphorodithioates 
have both non-linking oxygens replaced by sulfur. Unlike the situation where only one of X 
or Y is altered, the phosphorus center in the phosphorodithioates is achiral which precludes 
the formation of oligoribonucleotide diastereomers. Diastereomer formation can result in a 
preparation in which the individual diastereomers exhibit varying resistance to nucleases. 
Further, the hybridization affinity of RNA containing chiral phosphate groups can be lower 
relative to the corresponding unmodified RNA species. Thus, while not wishing to be bound 
by theory, modifications to both X and Y which eliminate the chiral center, e.g. 
phosphorodithioate formation, may be desirable in that they cannot produce diastereomer 
mixtures. Thus, X can be any one of S, Se, B, C, H, N, or OR (R is alkyl or aryl). Thus Y 
can be any one of S, Se, B, C, H, N, or OR (R is alkyl or aryl). Replacement of X and/or Y 
with sulfur is preferred. 
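
The scale of the diastereomer issue can be shown with a short calculation. Each linkage in which only one non-linking oxygen is replaced creates a stereocenter at phosphorus, so an oligomer with n such linkages, made without stereocontrol, can exist as up to 2^n diastereomers. The sketch below assumes a hypothetical labelling of linkages ("PO" for unmodified phosphate, "PS" for phosphorothioate, "PS2" for phosphorodithioate) and counts only stereocenters at phosphorus.

```python
# Illustrative sketch only; the linkage labels are a hypothetical convention.
CHIRAL_AT_PHOSPHORUS = {"PO": False, "PS": True, "PS2": False}

def diastereomer_count(linkages):
    """Maximum number of phosphorus diastereomers for a list of linkage labels,
    assuming synthesis without stereocontrol."""
    chiral = 0
    for label in linkages:
        if label not in CHIRAL_AT_PHOSPHORUS:
            raise ValueError(f"unknown linkage label: {label!r}")
        if CHIRAL_AT_PHOSPHORUS[label]:
            chiral += 1
    return 2 ** chiral

# A 21-mer has 20 internucleotide linkages.
print(diastereomer_count(["PS"] * 20))    # 1048576
print(diastereomer_count(["PS2"] * 20))   # 1
```

A fully phosphorothioate 21-mer can therefore be a mixture of about a million diastereomers, whereas the corresponding phosphorodithioate is a single species with respect to phosphorus stereochemistry.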

The phosphate linker can also be modified by replacement of a linking oxygen (i.e., 
W or Z in Formula 1) with nitrogen (bridged phosphoroamidates), sulfur (bridged 
phosphorothioates) and carbon (bridged methylenephosphonates). The replacement can 
occur at a terminal oxygen (position W (3') or position Z (5')). Replacement of W with 
carbon or Z with nitrogen is preferred. 

Candidate agents can be evaluated for suitability as described below. 

The Sugar Group 

A modified RNA can include modification of all or some of the sugar groups of the 
ribonucleic acid. E.g., the 2' hydroxyl group (OH) can be modified or replaced with a 
number of different "oxy" or "deoxy" substituents. While not being bound by theory, 
enhanced stability is expected since the hydroxyl can no longer be deprotonated to form a 2' 
alkoxide ion. The 2' alkoxide can catalyze degradation by intramolecular nucleophilic attack 
on the linker phosphorus atom. Again, while not wishing to be bound by theory, it can be 
desirable in some embodiments to introduce alterations in which alkoxide formation at the 2' 
position is not possible. 

Examples of "oxy"-2' hydroxyl group modifications include alkoxy or aryloxy (OR, 
e.g., R = H, alkyl, cycloalkyl, aryl, aralkyl, heteroaryl or sugar); polyethyleneglycols (PEG), 
O(CH2CH2O)nCH2CH2OR; "locked" nucleic acids (LNA) in which the 2' hydroxyl is 
connected, e.g., by a methylene bridge, to the 4' carbon of the same ribose sugar; O-AMINE 
(AMINE = NH2; alkylamino, dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl 
amino, or diheteroaryl amino, ethylene diamine, polyamino) and aminoalkoxy, 
O(CH2)nAMINE, (e.g., AMINE = NH2; alkylamino, dialkylamino, heterocyclyl, arylamino, 
diaryl amino, heteroaryl amino, or diheteroaryl amino, ethylene diamine, polyamino). It is 
noteworthy that oligonucleotides containing only the methoxyethyl group (MOE), 
(OCH2CH2OCH3, a PEG derivative), exhibit nuclease stabilities comparable to those 
modified with the robust phosphorothioate modification. 

"Deoxy" modifications include hydrogen {i.e. deoxyribose sugars, which are of 
particular relevance to the overhang portions of partially ds RNA); halo (e.g., fluoro); amino 
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{e,g, NH2; alkylamino, dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl 
amino, diheteroaryl amino, or amino acid); NH(CH2CH2NH)nCH2CH2-AMI]SIE (AMINE - 
NH2; alkylamino, dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino,or 
diheteroaryl amino), -NHC(0)R (R = aUcyl, cycloalkyl, aryl, aralkyl, heteroaryl or sugar), 
cyano; mercapto; alkyl-thio-alkyl; thioalkoxy; and alkyl, cycloalkyl, aryl, alkenyl and 
alkynyl, which may be optionally substituted with e.g., an amino functionality. Preferred 
substitutents are 2'-methoxyethyl, 2'-OCH3, 2'-0-allyl, 2'-C- allyl, and 2'-fluoro. 

The sugar group can also contain one or more carbons that possess the opposite 
stereochemical configuration from that of the corresponding carbon in ribose. Thus, a
modified RNA can include nucleotides containing e.g., arabinose, as the sugar. 

Modified RNA's can also include "abasic" sugars, which lack a nucleobase at C-1'.
These abasic sugars can also further contain modifications at one or more of the
constituent sugar atoms.

To maximize nuclease resistance, the 2' modifications can be used in combination 
with one or more phosphate linker modifications (e.g., phosphorothioate). The so-called 
"chimeric" oligonucleotides are those that contain two or more different modifications. 

The modification can also entail the wholesale replacement of a ribose structure with
another entity at one or more sites in the iRNA agent. These modifications are described in
the section entitled Ribose Replacements for RRMSs.

Candidate modifications can be evaluated as described below. 
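
As an illustration only, the definition of a "chimeric" oligonucleotide given in this section (one that contains two or more different modifications) can be sketched as a simple check. The class name, field names and modification labels below are hypothetical and are not part of the disclosure; the sketch assumes that each nucleotide position records at most one sugar modification and one linker modification.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Nucleotide:
    """One position of an oligonucleotide (hypothetical illustration)."""
    base: str                          # e.g. "A", "G", "C" or "U"
    sugar_mod: Optional[str] = None    # e.g. "2'-F"; None means unmodified ribose
    linker_mod: Optional[str] = None   # e.g. "phosphorothioate"; None means phosphodiester


def distinct_modifications(oligo: List[Nucleotide]) -> Set[str]:
    """Collect the different kinds of modification present in the oligonucleotide."""
    mods: Set[str] = set()
    for nt in oligo:
        if nt.sugar_mod is not None:
            mods.add(nt.sugar_mod)
        if nt.linker_mod is not None:
            mods.add(nt.linker_mod)
    return mods


def is_chimeric(oligo: List[Nucleotide]) -> bool:
    """True if the oligonucleotide carries two or more different modifications."""
    return len(distinct_modifications(oligo)) >= 2
```

Under these assumptions, an oligonucleotide combining a 2'-fluoro sugar with a phosphorothioate linker would be reported as chimeric, whereas one carrying only 2'-fluoro sugars would not.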

Replacement of the Phosphate Group 

The phosphate group can be replaced by non-phosphorus containing connectors (cf.
Bracket I in Formula 1 above). While not wishing to be bound by theory, it is believed that
since the charged phosphodiester group is the reaction center in nucleolytic degradation, its
replacement with neutral structural mimics should impart enhanced nuclease stability.
Again, while not wishing to be bound by theory, it can be desirable, in some embodiments, to
introduce alterations in which the charged phosphate group is replaced by a neutral moiety.

Examples of moieties which can replace the phosphate group include siloxane, 
carbonate, carboxymethyl, carbamate, amide, thioether, ethylene oxide linker, sulfonate, 
sulfonamide, thioformacetal, formacetal, oxime, methyleneimino, methylenemethylimino,
methylenehydrazo, methylenedimethylhydrazo and methyleneoxymethylimino. Preferred
replacements include the methylenecarbonylamino and methylenemethylimino groups. 
Candidate modifications can be evaluated as described below. 

Replacement of Ribophosphate Backbone

Oligonucleotide-mimicking scaffolds can also be constructed wherein the phosphate
linker and ribose sugar are replaced by nuclease resistant nucleoside or nucleotide surrogates
(see Bracket II of Formula 1 above). While not wishing to be bound by theory, it is believed
that the absence of a repetitively charged backbone diminishes binding to proteins that
recognize polyanions (e.g., nucleases). Again, while not wishing to be bound by theory, it
can be desirable, in some embodiments, to introduce alterations in which the bases are tethered
by a neutral surrogate backbone.

Examples include the morpholino, cyclobutyl, pyrrolidine and peptide nucleic acid
(PNA) nucleoside surrogates. A preferred surrogate is a PNA surrogate. 

Candidate modifications can be evaluated as described below. 

Terminal Modifications 

The 3' and 5' ends of an oligonucleotide can be modified. Such modifications can be 
at the 3' end, 5' end or both ends of the molecule. They can include modification or 
replacement of an entire terminal phosphate or of one or more of the atoms of the phosphate 
group. E.g., the 3' and 5' ends of an oligonucleotide can be conjugated to other functional
molecular entities such as labeling moieties, e.g., fluorophores (e.g., pyrene, TAMRA,
fluorescein, Cy3 or Cy5 dyes) or protecting groups (based e.g., on sulfur, silicon, boron or
ester). The functional molecular entities can be attached to the sugar through a phosphate
group and/or a spacer. The terminal atom of the spacer can connect to or replace the linking
atom of the phosphate group or the C-3' or C-5' O, N, S or C group of the sugar.
Alternatively, the spacer can connect to or replace the terminal atom of a nucleotide
surrogate (e.g., PNAs). These spacers or linkers can include e.g., -(CH2)n-, -(CH2)nN-,
-(CH2)nO-, -(CH2)nS-, O(CH2CH2O)nCH2CH2OH (e.g., n = 3 or 6), abasic sugars, amide,
carboxy, amine, oxyamine, oxyimine, thioether, disulfide, thiourea, sulfonamide, or
morpholino, or biotin and fluorescein reagents. When a spacer/phosphate-functional
molecular entity-spacer/phosphate array is interposed between two strands of iRNA agents,

WO 2004/080406 PCT/US2004/007070

this array can substitute for a hairpin RNA loop in a hairpin-type RNA agent. The 3' end can
be an -OH group. While not wishing to be bound by theory, it is believed that conjugation of
certain moieties can improve transport, hybridization, and specificity properties. Again,
while not wishing to be bound by theory, it may be desirable to introduce terminal alterations
that improve nuclease resistance. Other examples of terminal modifications include dyes,
intercalating agents (e.g., acridines), cross-linkers (e.g., psoralene, mitomycin C), porphyrins
(TPPC4, texaphyrin, Sapphyrin), polycyclic aromatic hydrocarbons (e.g., phenazine,
dihydrophenazine), artificial endonucleases (e.g., EDTA), lipophilic carriers (e.g., cholesterol,
cholic acid, adamantane acetic acid, 1-pyrene butyric acid, dihydrotestosterone,
1,3-Bis-O(hexadecyl)glycerol, geranyloxyhexyl group, hexadecylglycerol, borneol, menthol,
1,3-propanediol, heptadecyl group, palmitic acid, myristic acid, O3-(oleoyl)lithocholic acid,
O3-(oleoyl)cholenic acid, dimethoxytrityl, or phenoxazine) and peptide conjugates (e.g.,
antennapedia peptide, Tat peptide), alkylating agents, phosphate, amino, mercapto, PEG
(e.g., PEG-40K), MPEG, [MPEG]2, polyamino, alkyl, substituted alkyl, radiolabeled
markers, enzymes, haptens (e.g., biotin), transport/absorption facilitators (e.g., aspirin,
vitamin E, folic acid), synthetic ribonucleases (e.g., imidazole, bisimidazole, histamine,
imidazole clusters, acridine-imidazole conjugates, Eu3+ complexes of tetraazamacrocycles).

Terminal modifications can be added for a number of reasons, including as discussed
elsewhere herein to modulate activity or to modulate resistance to degradation. Terminal
modifications useful for modulating activity include modification of the 5' end with
phosphate or phosphate analogs. E.g., in preferred embodiments iRNA agents, especially
antisense strands, are 5' phosphorylated or include a phosphoryl analog at the 5'
terminus. 5'-phosphate modifications include those which are compatible with RISC
mediated gene silencing. Suitable modifications include: 5'-monophosphate
((HO)2(O)P-O-5'); 5'-diphosphate ((HO)2(O)P-O-P(HO)(O)-O-5'); 5'-triphosphate
((HO)2(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'); 5'-guanosine cap (7-methylated or non-methylated)
(7m-G-O-5'-(HO)(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'); 5'-adenosine cap (Appp), and any modified or
unmodified nucleotide cap structure (N-O-5'-(HO)(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5');
5'-monothiophosphate (phosphorothioate; (HO)2(S)P-O-5'); 5'-monodithiophosphate
(phosphorodithioate; (HO)(HS)(S)P-O-5'), 5'-phosphorothiolate ((HO)2(O)P-S-5'); any
additional combination of oxygen/sulfur replaced monophosphate, diphosphate and
triphosphates (e.g., 5'-alpha-thiotriphosphate, 5'-gamma-thiotriphosphate, etc.),
5'-phosphoramidates ((HO)2(O)P-NH-5', (HO)(NH2)(O)P-O-5'), 5'-alkylphosphonates
(R=alkyl=methyl, ethyl, isopropyl, propyl, etc., e.g., RP(OH)(O)-O-5'-, (OH)2(O)P-5'-CH2-),
5'-alkyletherphosphonates (R=alkylether=methoxymethyl (MeOCH2-), ethoxymethyl, etc.,
e.g., RP(OH)(O)-O-5'-).

Terminal modifications useful for increasing resistance to degradation include 
Terminal modifications can also be useful for monitoring distribution, and in such
cases the preferred groups to be added include fluorophores, e.g., fluorescein or an Alexa dye,
e.g., Alexa 488. Terminal modifications can also be useful for enhancing uptake; useful
modifications for this include cholesterol. Terminal modifications can also be useful for
cross-linking an RNA agent to another moiety; modifications useful for this include
mitomycin C.

Candidate modifications can be evaluated as described below.

The Bases

Adenine, guanine, cytosine and uracil are the most common bases found in RNA.
These bases can be modified or replaced to provide RNA's having improved properties.
E.g., nuclease resistant oligoribonucleotides can be prepared with these bases or with
synthetic and natural nucleobases (e.g., inosine, thymine, xanthine, hypoxanthine,
nebularine, isoguanisine, or tubercidine) and any one of the above modifications.
Alternatively, substituted or modified analogs of any of the above bases, e.g., "unusual
bases" and "universal bases," can be employed. Examples include without limitation
2-aminoadenine, 6-methyl and other alkyl derivatives of adenine and guanine, 2-propyl and
other alkyl derivatives of adenine and guanine, 5-halouracil and cytosine, 5-propynyl uracil
and cytosine, 6-azo uracil, cytosine and thymine, 5-uracil (pseudouracil), 4-thiouracil,
5-halouracil, 5-(2-aminopropyl)uracil, 5-amino allyl uracil, 8-halo, amino, thiol, thioalkyl,
hydroxyl and other 8-substituted adenines and guanines, 5-trifluoromethyl and other
5-substituted uracils and cytosines, 7-methylguanine, 5-substituted pyrimidines,
6-azapyrimidines and N-2, N-6 and O-6 substituted purines, including 2-aminopropyladenine,
5-propynyluracil and 5-propynylcytosine, dihydrouracil, 3-deaza-5-azacytosine,
2-aminopurine, 5-alkyluracil, 7-alkylguanine, 5-alkyl cytosine, 7-deazaadenine,
N6, N6-dimethyladenine, 2,6-diaminopurine, 5-amino-allyl-uracil, N3-methyluracil, substituted
1,2,4-triazoles, 2-pyridinone, 5-nitroindole, 3-nitropyrrole, 5-methoxyuracil,
uracil-5-oxyacetic acid, 5-methoxycarbonylmethyluracil, 5-methyl-2-thiouracil,
5-methoxycarbonylmethyl-2-thiouracil, 5-methylaminomethyl-2-thiouracil,
3-(3-amino-3-carboxypropyl)uracil, 3-methylcytosine, 5-methylcytosine, N4-acetyl cytosine,
2-thiocytosine, N6-methyladenine, N6-isopentyladenine, 2-methylthio-N6-isopentenyladenine,
N-methylguanines, or O-alkylated bases. Further purines and pyrimidines include those
disclosed in U.S. Pat. No. 3,687,808, those disclosed in the Concise Encyclopedia Of
Polymer Science And Engineering, pages 858-859, Kroschwitz, J. I., ed. John Wiley & Sons,
1990, and those disclosed by Englisch et al., Angewandte Chemie, International Edition,
1991, 30, 613.

Generally, base changes are less preferred for promoting stability, but they can be
useful for other reasons, e.g., some, e.g., 2,6-diaminopurine and 2 amino purine, are
fluorescent. Modified bases can reduce target specificity. This should be taken into
consideration in the design of iRNA agents.

Candidate modifications can be evaluated as described below.



Evaluation of Candidate RNA's 

One can evaluate a candidate RNA agent, e.g., a modified RNA, for a selected
property by exposing the agent or modified molecule and a control molecule to the
appropriate conditions and evaluating for the presence of the selected property. For example,
resistance to a degradant can be evaluated as follows. A candidate modified RNA (and
preferably a control molecule, usually the unmodified form) can be exposed to degradative
conditions, e.g., exposed to a milieu which includes a degradative agent, e.g., a nuclease.
E.g., one can use a biological sample, e.g., one that is similar to a milieu which might be
encountered in therapeutic use, e.g., blood or a cellular fraction, e.g., a cell-free homogenate
or disrupted cells. The candidate and control could then be evaluated for resistance to
degradation by any of a number of approaches. For example, the candidate and control could
be labeled, preferably prior to exposure, with, e.g., a radioactive or enzymatic label, or a
fluorescent label, such as Cy3 or Cy5. Control and modified RNA's can be incubated with
the degradative agent, and optionally a control, e.g., an inactivated, e.g., heat inactivated,
degradative agent. A physical parameter, e.g., size, of the modified and control molecules
is then determined. It can be determined by a physical method, e.g., by polyacrylamide
gel electrophoresis or a sizing column, to assess whether the molecule has maintained its
original length, or assessed functionally. Alternatively, Northern blot analysis can be used to
assay the length of an unlabeled modified molecule.
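
By way of a minimal sketch only, the comparison of candidate and control described above can be reduced to a simple calculation once a size-based readout is available. The sketch assumes that a gel or sizing column yields a signal for the full-length species and a total signal for each sample, in arbitrary units; the function names and the choice of a simple ratio are illustrative assumptions, not a method prescribed by this disclosure.

```python
def fraction_full_length(full_length_signal: float, total_signal: float) -> float:
    """Fraction of the labeled RNA that has maintained its original length.

    Both arguments are signal intensities in arbitrary units (assumed inputs),
    e.g. the full-length band and the whole lane of a polyacrylamide gel.
    """
    if total_signal <= 0:
        raise ValueError("total_signal must be positive")
    if not 0 <= full_length_signal <= total_signal:
        raise ValueError("full_length_signal must lie between 0 and total_signal")
    return full_length_signal / total_signal


def relative_stability(candidate_fraction: float, control_fraction: float) -> float:
    """Ratio of the candidate's intact fraction to that of the unmodified control.

    A value above 1 suggests the modified RNA resisted degradation better than
    the control under the same conditions.
    """
    if control_fraction <= 0:
        raise ValueError("control_fraction must be positive")
    return candidate_fraction / control_fraction
```

A parallel incubation with an inactivated degradative agent, as mentioned above, could be passed through the same functions to confirm that loss of full-length material depends on the active agent.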

A functional assay can also be used to evaluate the candidate agent. A functional
assay can be applied initially or after an earlier non-functional assay (e.g., assay for
resistance to degradation) to determine if the modification alters the ability of the molecule to
silence gene expression. For example, a cell, e.g., a mammalian cell, such as a mouse or
human cell, can be co-transfected with a plasmid expressing a fluorescent protein, e.g., GFP,
and a candidate RNA agent homologous to the transcript encoding the fluorescent protein
(see, e.g., WO 00/44914). For example, a modified dsRNA homologous to the GFP mRNA
can be assayed for the ability to inhibit GFP expression by monitoring for a decrease in cell
fluorescence, as compared to a control cell, in which the transfection did not include the
candidate dsRNA, e.g., controls with no agent added and/or controls with a non-modified
RNA added. Efficacy of the candidate agent on gene expression can be assessed by
comparing cell fluorescence in the presence of the modified and unmodified dsRNA agents.
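
The fluorescence comparison described above can be sketched as a percent-knockdown calculation. This is an illustration under stated assumptions: the readings are mean cell fluorescence values in arbitrary units, a background reading from cells not expressing the fluorescent protein is available, and the function name and background-subtraction step are hypothetical choices rather than part of the disclosed assay.

```python
def percent_knockdown(treated: float, untreated: float, background: float = 0.0) -> float:
    """Percent decrease in fluorescence relative to a no-agent control.

    treated:    fluorescence of cells transfected with the candidate dsRNA
    untreated:  fluorescence of control cells transfected without the dsRNA
    background: fluorescence of cells lacking the reporter (assumed optional input)
    """
    control_signal = untreated - background
    if control_signal <= 0:
        raise ValueError("untreated signal must exceed background")
    treated_signal = treated - background
    return 100.0 * (1.0 - treated_signal / control_signal)
```

Computing this value once for the modified dsRNA and once for the unmodified dsRNA, against the same no-agent control, gives two figures that can be compared to judge whether the modification alters silencing activity.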

In an alternative functional assay, a candidate dsRNA agent homologous to an
endogenous mouse gene, preferably a maternally expressed gene, such as c-mos, can be
injected into an immature mouse oocyte to assess the ability of the agent to inhibit gene
expression in vivo (see, e.g., WO 01/36646). A phenotype of the oocyte, e.g., the ability to
maintain arrest in metaphase II, can be monitored as an indicator that the agent is inhibiting
expression. For example, cleavage of c-mos mRNA by a dsRNA agent would cause the
oocyte to exit metaphase arrest and initiate parthenogenetic development (Colledge et al.
Nature 370: 65-68, 1994; Hashimoto et al. Nature, 370:68-71, 1994). The effect of the
modified agent on target RNA levels can be verified by Northern blot to assay for a decrease
in the level of target mRNA, or by Western blot to assay for a decrease in the level of target
protein, as compared to a negative control. Controls can include cells in which no agent
is added and/or cells in which a non-modified RNA is added.

oligoribonucleosides as well as mixed backbone compounds having, as for instance,
alternating MMI and PO or PS linkages can be prepared as is described in U.S. Pat. Nos.
5,378,825, 5,386,023, 5,489,677 and in published PCT applications PCT/US92/04294 and
PCT/US92/04305 (published as WO 92/20822 and WO 92/20823, respectively). Formacetal
and thioformacetal linked oligoribonucleosides can be prepared as is described in U.S. Pat.
Nos. 5,264,562 and 5,264,564. Ethylene oxide linked oligoribonucleosides can be prepared
as is described in U.S. Pat. No. 5,223,618. Siloxane replacements are described in
Cormier, J.F. et al. Nucleic Acids Res. 1988, 16, 4583. Carbonate replacements are described
in Tittensor, J.R. J. Chem. Soc. C 1971, 1933. Carboxymethyl replacements are described in
Edge, M.D. et al. J. Chem. Soc. Perkin Trans. 1 1972, 1991. Carbamate replacements are
described in Stirchak, E.P. Nucleic Acids Res. 1989, 17, 6129.

Replacement of the Phosphate-Ribose Backbone References 
Cyclobutyl sugar surrogate compounds can be prepared as is described in U.S. Pat.
No. 5,359,044. Pyrrolidine sugar surrogate can be prepared as is described in U.S. Pat. No.
5,519,134. Morpholino sugar surrogates can be prepared as is described in U.S. Pat. Nos.
5,142,047 and 5,235,033, and other related patent disclosures. Peptide Nucleic Acids (PNAs)
are known per se and can be prepared in accordance with any of the various procedures
referred to in Peptide Nucleic Acids (PNA): Synthesis, Properties and Potential Applications,
Bioorganic & Medicinal Chemistry, 1996, 4, 5-23. They may also be prepared in accordance
with U.S. Pat. No. 5,539,083.

Terminal Modification References 

Terminal modifications are described in Manoharan, M. et al. Antisense and Nucleic
Acid Drug Development 12, 103-128 (2002) and references therein.

Bases References 

N-2 substituted purine nucleoside amidites can be prepared as is described in U.S. Pat.
No. 5,459,255. 3-Deaza purine nucleoside amidites can be prepared as is described in U.S.
Pat. No. 5,457,191. 5,6-Substituted pyrimidine nucleoside amidites can be prepared as is
described in U.S. Pat. No. 5,614,617. 5-Propynyl pyrimidine nucleoside amidites can be
prepared as is described in U.S. Pat. No. 5,484,908. Additional references are disclosed
in the above section on base modifications.

Preferred iRNA Agents

Preferred RNA agents have the following structure (see Formula 2 below): 




FORMULA 2 

Referring to Formula 2 above, R1, R2 and R3 are each, independently, H (i.e., abasic
nucleotides), adenine, guanine, cytosine and uracil, inosine, thymine, xanthine,
hypoxanthine, nebularine, tubercidine, isoguanisine, 2-aminoadenine, 6-methyl and other
alkyl derivatives of adenine and guanine, 2-propyl and other alkyl derivatives of adenine and
guanine, 5-halouracil and cytosine, 5-propynyl uracil and cytosine, 6-azo uracil, cytosine and
thymine, 5-uracil (pseudouracil), 4-thiouracil, 5-halouracil, 5-(2-aminopropyl)uracil, 5-amino
allyl uracil, 8-halo, amino, thiol, thioalkyl, hydroxyl and other 8-substituted adenines and
guanines, 5-trifluoromethyl and other 5-substituted uracils and cytosines, 7-methylguanine,
5-substituted pyrimidines, 6-azapyrimidines and N-2, N-6 and O-6 substituted purines,
including 2-aminopropyladenine, 5-propynyluracil and 5-propynylcytosine, dihydrouracil,
3-deaza-5-azacytosine, 2-aminopurine, 5-alkyluracil, 7-alkylguanine, 5-alkyl cytosine,
7-deazaadenine, 7-deazaguanine, N6, N6-dimethyladenine, 2,6-diaminopurine,
5-amino-allyl-uracil, N3-methyluracil, substituted 1,2,4-triazoles, 2-pyridinone, 5-nitroindole,
3-nitropyrrole, 5-methoxyuracil, uracil-5-oxyacetic acid, 5-methoxycarbonylmethyluracil,
5-methyl-2-thiouracil, 5-methoxycarbonylmethyl-2-thiouracil, 5-methylaminomethyl-2-thiouracil,
3-(3-amino-3-carboxypropyl)uracil, 3-methylcytosine, 5-methylcytosine, N4-acetyl
cytosine, 2-thiocytosine, N6-methyladenine, N6-isopentyladenine,
2-methylthio-N6-isopentenyladenine, N-methylguanines, or O-alkylated bases.

R4, R5 and R6 are each, independently, OR8, O(CH2CH2O)mCH2CH2OR8;
O(CH2)nR9; O(CH2)nOR9, H; halo; NH2; NHR8; N(R8)2; NH(CH2CH2NH)mCH2CH2NHR9;
NHC(O)R8; cyano; mercapto, SR8; alkyl-thio-alkyl; alkyl, aralkyl, cycloalkyl, aryl,
heteroaryl, alkenyl, alkynyl, each of which may be optionally substituted with halo, hydroxy,
oxo, nitro, haloalkyl, alkyl, alkaryl, aryl, aralkyl, alkoxy, aryloxy, amino, alkylamino,
dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, diheteroaryl amino,
acylamino, alkylcarbamoyl, arylcarbamoyl, aminoalkyl, alkoxycarbonyl, carboxy,
hydroxyalkyl, alkanesulfonyl, alkanesulfonamido, arenesulfonamido, aralkylsulfonamido,
alkylcarbonyl, acyloxy, cyano, or ureido; or R4, R5 or R6 together combine with R7 to form
an [-O-CH2-] covalently bound bridge between the sugar 2' and 4' carbons.

A1 is:

[structural formulae not reproduced: alternative phosphate-containing groups bearing X1, Y1 and W1]

; H; OH; OCH3; W1; an abasic nucleotide; or absent;

(a preferred A1, especially with regard to anti-sense strands, is chosen from
5'-monophosphate ((HO)2(O)P-O-5'), 5'-diphosphate ((HO)2(O)P-O-P(HO)(O)-O-5'),
5'-triphosphate ((HO)2(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'), 5'-guanosine cap (7-methylated or
non-methylated) (7m-G-O-5'-(HO)(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'), 5'-adenosine cap
(Appp), and any modified or unmodified nucleotide cap structure
(N-O-5'-(HO)(O)P-O-(HO)(O)P-O-P(HO)(O)-O-5'), 5'-monothiophosphate (phosphorothioate;
(HO)2(S)P-O-5'), 5'-monodithiophosphate (phosphorodithioate; (HO)(HS)(S)P-O-5'),
5'-phosphorothiolate ((HO)2(O)P-S-5'); any additional combination of oxygen/sulfur replaced
monophosphate, diphosphate and triphosphates (e.g., 5'-alpha-thiotriphosphate,
5'-gamma-thiotriphosphate, etc.), 5'-phosphoramidates ((HO)2(O)P-NH-5', (HO)(NH2)(O)P-O-5'),
5'-alkylphosphonates (R=alkyl=methyl, ethyl, isopropyl, propyl, etc., e.g., RP(OH)(O)-O-5'-,
(OH)2(O)P-5'-CH2-), 5'-alkyletherphosphonates (R=alkylether=methoxymethyl (MeOCH2-),
ethoxymethyl, etc., e.g., RP(OH)(O)-O-5'-)).

A2 is:

[structural formula not reproduced: a phosphate-type group bearing X2, Y2 and Z2]

A3 is:

[structural formula not reproduced: a phosphate-type group bearing X3, Y3 and Z3]

; and

A4 is:

[structural formulae not reproduced: alternative phosphate-containing groups bearing X4, Y4 and W4]

; H; Z4; an inverted nucleotide; an abasic nucleotide; or absent.

W1 is OH, (CH2)nR10, (CH2)nNHR10, (CH2)nOR10, (CH2)nSR10; O(CH2)nR10;
O(CH2)nOR10, O(CH2)nNR10, O(CH2)nSR10; O(CH2)nSS(CH2)nOR10, O(CH2)nC(O)OR10,
NH(CH2)nR10; NH(CH2)nNR10; NH(CH2)nOR10, NH(CH2)nSR10; S(CH2)nR10, S(CH2)nNR10,
S(CH2)nOR10, S(CH2)nSR10, O(CH2CH2O)mCH2CH2OR10; O(CH2CH2O)mCH2CH2NHR10,
NH(CH2CH2NH)mCH2CH2NHR10; Q-R10, O-Q-R10, N-Q-R10, S-Q-R10 or -O-. W4 is O, CH2,
NH, or S.

X1, X2, X3 and X4 are each, independently, O or S.

Y1, Y2, Y3 and Y4 are each, independently, OH, O-, OR8, S, Se, BH3-, H, NHR9,
N(R9)2, alkyl, cycloalkyl, aralkyl, aryl, or heteroaryl, each of which may be optionally
substituted.

Z1, Z2 and Z3 are each independently O, CH2, NH, or S. Z4 is OH, (CH2)nR10,
(CH2)nNHR10, (CH2)nOR10, (CH2)nSR10; O(CH2)nR10; O(CH2)nOR10, O(CH2)nNR10,
O(CH2)nSR10, O(CH2)nSS(CH2)nOR10, O(CH2)nC(O)OR10; NH(CH2)nR10; NH(CH2)nNR10;
NH(CH2)nOR10, NH(CH2)nSR10; S(CH2)nR10, S(CH2)nNR10, S(CH2)nOR10, S(CH2)nSR10,
O(CH2CH2O)mCH2CH2OR10, O(CH2CH2O)mCH2CH2NHR10,
NH(CH2CH2NH)mCH2CH2NHR10; Q-R10, N-Q-R10, S-Q-R10.

x is 5-100, chosen to comply with a length for an RNA agent described herein.

R7 is H; or is together combined with R4, R5 or R6 to form an [-O-CH2-] covalently
bound bridge between the sugar 2' and 4' carbons.

R8 is alkyl, cycloalkyl, aryl, aralkyl, heterocyclyl, heteroaryl, amino acid, or sugar; R9
is NH2, alkylamino, dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino,
diheteroaryl amino, or amino acid; and R10 is H; fluorophore (pyrene, TAMRA, fluorescein,
Cy3 or Cy5 dyes); sulfur, silicon, boron or ester protecting group; intercalating agents (e.g.,
acridines), cross-linkers (e.g., psoralene, mitomycin C), porphyrins (TPPC4, texaphyrin,
Sapphyrin), polycyclic aromatic hydrocarbons (e.g., phenazine, dihydrophenazine), artificial
endonucleases (e.g., EDTA), lipophilic carriers (cholesterol, cholic acid, adamantane acetic
acid, 1-pyrene butyric acid, dihydrotestosterone, 1,3-Bis-O(hexadecyl)glycerol,
geranyloxyhexyl group, hexadecylglycerol, borneol, menthol, 1,3-propanediol, heptadecyl
group, palmitic acid, myristic acid, O3-(oleoyl)lithocholic acid, O3-(oleoyl)cholenic acid,
dimethoxytrityl, or phenoxazine) and peptide conjugates (e.g., antennapedia peptide, Tat
peptide), alkylating agents, phosphate, amino, mercapto, PEG (e.g., PEG-40K), MPEG,
[MPEG]2, polyamino; alkyl, cycloalkyl, aryl, aralkyl, heteroaryl; radiolabelled markers,
enzymes, haptens (e.g., biotin), transport/absorption facilitators (e.g., aspirin, vitamin E, folic
acid), synthetic ribonucleases (e.g., imidazole, bisimidazole, histamine, imidazole clusters,
acridine-imidazole conjugates, Eu3+ complexes of tetraazamacrocycles); or an RNA agent.
m is 0-1,000,000, and n is 0-20. Q is a spacer selected from the group consisting of abasic
sugar, amide, carboxy, oxyamine, oxyimine, thioether, disulfide, thiourea, sulfonamide, or
morpholino, biotin or fluorescein reagents.

Preferred RNA agents in which the entire phosphate group has been replaced have the
following structure (see Formula 3 below):




FORMULA 3 

Referring to Formula 3, A10-A40 is L-G-L; A10 and/or A40 may be absent, in which L
is a linker, wherein one or both L may be present or absent and is selected from the group
consisting of CH2(CH2)g; N(CH2)g; O(CH2)g; S(CH2)g. G is a functional group selected from
the group consisting of siloxane, carbonate, carboxymethyl, carbamate, amide, thioether,
ethylene oxide linker, sulfonate, sulfonamide, thioformacetal, formacetal, oxime,
methyleneimino, methylenemethylimino, methylenehydrazo, methylenedimethylhydrazo and
methyleneoxymethylimino.

R10, R20, and R30 are each, independently, H (i.e., abasic nucleotides), adenine,
guanine, cytosine and uracil, inosine, thymine, xanthine, hypoxanthine, nebularine,
tubercidine, isoguanisine, 2-aminoadenine, 6-methyl and other alkyl derivatives of adenine
and guanine, 2-propyl and other alkyl derivatives of adenine and guanine, 5-halouracil and
cytosine, 5-propynyl uracil and cytosine, 6-azo uracil, cytosine and thymine, 5-uracil
(pseudouracil), 4-thiouracil, 5-halouracil, 5-(2-aminopropyl)uracil, 5-amino allyl uracil,
8-halo, amino, thiol, thioalkyl, hydroxyl and other 8-substituted adenines and guanines,
5-trifluoromethyl and other 5-substituted uracils and cytosines, 7-methylguanine, 5-substituted
pyrimidines, 6-azapyrimidines and N-2, N-6 and O-6 substituted purines, including
2-aminopropyladenine, 5-propynyluracil and 5-propynylcytosine, dihydrouracil,
3-deaza-5-azacytosine, 2-aminopurine, 5-alkyluracil, 7-alkylguanine, 5-alkyl cytosine,
7-deazaadenine, 7-deazaguanine, N6, N6-dimethyladenine, 2,6-diaminopurine,
5-amino-allyl-uracil, N3-methyluracil, substituted 1,2,4-triazoles, 2-pyridinone, 5-nitroindole,
3-nitropyrrole, 5-methoxyuracil, uracil-5-oxyacetic acid, 5-methoxycarbonylmethyluracil,
5-methyl-2-thiouracil, 5-methoxycarbonylmethyl-2-thiouracil, 5-methylaminomethyl-2-thiouracil,
3-(3-amino-3-carboxypropyl)uracil, 3-methylcytosine, 5-methylcytosine, N4-acetyl cytosine,
2-thiocytosine, N6-methyladenine, N6-isopentyladenine, 2-methylthio-N6-isopentenyladenine,
N-methylguanines, or O-alkylated bases.

R40, R50 and R60 are each, independently, OR8, O(CH2CH2O)mCH2CH2OR8;
O(CH2)nR9; O(CH2)nOR9, H; halo; NH2; NHR8; N(R8)2; NH(CH2CH2NH)mCH2CH2R9;
NHC(O)R8; cyano; mercapto, SR8; alkyl-thio-alkyl; alkyl, aralkyl, cycloalkyl, aryl,
heteroaryl, alkenyl, alkynyl, each of which may be optionally substituted with halo, hydroxy,
oxo, nitro, haloalkyl, alkyl, alkaryl, aryl, aralkyl, alkoxy, aryloxy, amino, alkylamino,
dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, diheteroaryl amino,
acylamino, alkylcarbamoyl, arylcarbamoyl, aminoalkyl, alkoxycarbonyl, carboxy,
hydroxyalkyl, alkanesulfonyl, alkanesulfonamido, arenesulfonamido, aralkylsulfonamido,
alkylcarbonyl, acyloxy, cyano, and ureido groups; or R40, R50, or R60 together combine with
R70 to form an [-O-CH2-] covalently bound bridge between the sugar 2' and 4' carbons.

x is 5-100 or chosen to comply with a length for an RNA agent described herein.

R70 is H; or is together combined with R40, R50 or R60 to form an [-O-CH2-]
covalently bound bridge between the sugar 2' and 4' carbons.

R8 is alkyl, cycloalkyl, aryl, aralkyl, heterocyclyl, heteroaryl, amino acid, or sugar;
and R9 is NH2, alkylamino, dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl
amino, diheteroaryl amino, or amino acid. m is 0-1,000,000, n is 0-20, and g is 0-2.

Preferred nucleoside surrogates have the following structure (see Formula 4 below):

SLR100-(M-SLR200)x-M-SLR300

FORMULA 4

S is a nucleoside surrogate selected from the group consisting of morpholino,
cyclobutyl, pyrrolidine and peptide nucleic acid. L is a linker and is selected from the group
consisting of CH2(CH2)g; N(CH2)g; O(CH2)g; S(CH2)g; -C(O)(CH2)n- or may be absent. M is
an amide bond; sulfonamide; sulfinate; phosphate group; modified phosphate group as
described herein; or may be absent.

R100, R200, and R300 are each, independently, H (i.e., abasic nucleotides), adenine,
guanine, cytosine and uracil, inosine, thymine, xanthine, hypoxanthine, nebularine,
tubercidine, isoguanisine, 2-aminoadenine, 6-methyl and other alkyl derivatives of adenine
and guanine, 2-propyl and other alkyl derivatives of adenine and guanine, 5-halouracil and
cytosine, 5-propynyl uracil and cytosine, 6-azo uracil, cytosine and thymine, 5-uracil
(pseudouracil), 4-thiouracil, 5-halouracil, 5-(2-aminopropyl)uracil, 5-amino allyl uracil,
8-halo, amino, thiol, thioalkyl, hydroxyl and other 8-substituted adenines and guanines,
5-trifluoromethyl and other 5-substituted uracils and cytosines, 7-methylguanine, 5-substituted
pyrimidines, 6-azapyrimidines and N-2, N-6 and O-6 substituted purines, including
2-aminopropyladenine, 5-propynyluracil and 5-propynylcytosine, dihydrouracil,
3-deaza-5-azacytosine, 2-aminopurine, 5-alkyluracil, 7-alkylguanine, 5-alkyl cytosine,
7-deazaadenine, 7-deazaguanine, N6, N6-dimethyladenine, 2,6-diaminopurine,
5-amino-allyl-uracil, N3-methyluracil, substituted 1,2,4-triazoles, 2-pyridinones, 5-nitroindole,
3-nitropyrrole, 5-methoxyuracil, uracil-5-oxyacetic acid, 5-methoxycarbonylmethyluracil,
5-methyl-2-thiouracil, 5-methoxycarbonylmethyl-2-thiouracil, 5-methylaminomethyl-2-thiouracil,
3-(3-amino-3-carboxypropyl)uracil, 3-methylcytosine, 5-methylcytosine, N4-acetyl cytosine,
2-thiocytosine, N6-methyladenine, N6-isopentyladenine, 2-methylthio-N6-isopentenyladenine,
N-methylguanines, or O-alkylated bases.

x is 5-100, or chosen to comply with a length for an RNA agent described herein; and
g is 0-2.

Nuclease resistant monomers 

In one aspect, the invention features a nuclease resistant monomer, or an iRNA
agent which incorporates a nuclease resistant monomer (NRM), such as those described
herein and those described in copending, co-owned United States Provisional Application
Serial No. 60/469,612 (Attorney Docket No. 14174-069P01), filed on May 9, 2003, which is
hereby incorporated by reference.

In addition, the invention includes iRNA agents having an NRM and another element
described herein. E.g., the invention includes an iRNA agent described herein, e.g., a
palindromic iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent
which targets a gene described herein, e.g., a gene active in the liver, an iRNA agent having
an architecture or structure described herein, an iRNA associated with an amphipathic
delivery agent described herein, an iRNA associated with a drug delivery module described
herein, an iRNA agent administered as described herein, or an iRNA agent formulated as
described herein, which also incorporates an NRM.

An iRNA agent can include monomers which have been modified so as to inhibit
degradation, e.g., by nucleases, e.g., endonucleases or exonucleases, found in the body of a
subject. These monomers are referred to herein as NRM's, or nuclease resistance promoting
monomers or modifications. In many cases these modifications will modulate other
properties of the iRNA agent as well, e.g., the ability to interact with a protein, e.g., a
transport protein, e.g., serum albumin, or a member of the RISC (RNA-induced Silencing
Complex), or the ability of the first and second sequences to form a duplex with one another
or to form a duplex with another sequence, e.g., a target molecule.

While not wishing to be bound by theory, it is believed that modifications of the
sugar, base, and/or phosphate backbone in an iRNA agent can enhance endonuclease and
exonuclease resistance, and can enhance interactions with transporter proteins and one or
more of the functional components of the RISC complex. Preferred modifications are those
that increase exonuclease and endonuclease resistance and thus prolong the half-life of the
iRNA agent prior to interaction with the RISC complex, but at the same time do not render
the iRNA agent resistant to endonuclease activity in the RISC complex. Again, while not
wishing to be bound by any theory, it is believed that placement of the modifications at or
near the 3' and/or 5' end of antisense strands can result in iRNA agents that meet the
preferred nuclease resistance criteria delineated above. Again, while still not wishing to be
bound by any theory, it is believed that placement of the modifications at, e.g., the middle of a
sense strand can result in iRNA agents that are relatively less likely to undergo off-targeting.

Modifications described herein can be incorporated into any double-stranded RNA and
RNA-like molecule described herein, e.g., an iRNA agent. An iRNA agent may include a
duplex comprising a hybridized sense and antisense strand, in which the antisense strand
and/or the sense strand may include one or more of the modifications described herein. The
antisense strand may include modifications at the 3' end and/or the 5' end and/or at one or
more positions that occur 1-6 (e.g., 1-5, 1-4, 1-3, 1-2) nucleotides from either end of the
strand. The sense strand may include modifications at the 3' end and/or the 5' end and/or at
any one of the intervening positions between the two ends of the strand. The iRNA agent
may also include a duplex comprising two hybridized antisense strands. The first and/or the
second antisense strand may include one or more of the modifications described herein.
Thus, one and/or both antisense strands may include modifications at the 3' end and/or the 5'
end and/or at one or more positions that occur 1-6 (e.g., 1-5, 1-4, 1-3, 1-2) nucleotides from
either end of the strand. Particular configurations are discussed below.

Modifications that can be useful for producing iRNA agents that meet the preferred
nuclease resistance criteria delineated above can include one or more of the following
chemical and/or stereochemical modifications of the sugar, base, and/or phosphate backbone:

(i) chiral (Sp) thioates. Thus, preferred NRM's include nucleotide dimers
enriched or pure for a particular chiral form of a modified phosphate group containing a
heteroatom at the nonbridging position, e.g., Sp or Rp, at the position X, where this is the
position normally occupied by the oxygen. The atom at X can also be S, Se, NR2, or BR3.
When X is S, an enriched or chirally pure Sp linkage is preferred. Enriched means at least 70,
80, 90, 95, or 99% of the preferred form. Such NRM's are discussed in more detail below;

(ii) attachment of one or more cationic groups to the sugar, base, and/or the
phosphorus atom of a phosphate or modified phosphate backbone moiety. Thus, preferred
NRM's include monomers at the terminal position derivatized with a cationic group. As the 5'
end of an antisense sequence should have a terminal -OH or phosphate group, this NRM is
preferably not used at the 5' end of an antisense sequence. The group should be attached at a
position on the base which minimizes interference with H bond formation and hybridization,
e.g., away from the face which interacts with the complementary base on the other strand,
e.g., at the 5' position of a pyrimidine or a 7-position of a purine. These are discussed in
more detail below;

(iii) nonphosphate linkages at the termini. Thus, preferred NRM's include
nonphosphate linkages, e.g., a linkage of 4 atoms which confers greater resistance to cleavage
than does a phosphate bond. Examples include 3' CH2-NCH3-O-CH2-5' and
3' CH2-NH-(O=)-CH2-5';

(iv) 3'-bridging thiophosphates and 5'-bridging thiophosphates. Thus, preferred
NRM's can include these structures;

(v) L-RNA, 2'-5' linkages, inverted linkages, α-nucleosides. Thus, other preferred
NRM's include: L nucleosides and dimeric nucleotides derived from L-nucleosides; 2'-5'
phosphate, non-phosphate and modified phosphate linkages (e.g., thiophosphates,
phosphoramidates and boronophosphates); dimers having inverted linkages, e.g., 3'-3' or
5'-5' linkages; monomers having an alpha linkage at the 1' site on the sugar, e.g., the structures
described herein having an alpha linkage;

(vi) conjugate groups. Thus, preferred NRM's can include, e.g., a targeting moiety or
a conjugated ligand described herein conjugated with the monomer, e.g., through the sugar,
base, or backbone;

(vii) abasic linkages. Thus, preferred NRM's can include an abasic monomer, e.g., an
abasic monomer as described herein (e.g., a nucleobaseless monomer); an aromatic or
heterocyclic or polyheterocyclic aromatic monomer as described herein; and

(viii) 5'-phosphonates and 5'-phosphate prodrugs. Thus, preferred NRM's include
monomers, preferably at the terminal position, e.g., the 5' position, in which one or more
atoms of the phosphate group are derivatized with a protecting group, which protecting group
or groups are removed as a result of the action of a component in the subject's body, e.g., a
carboxyesterase or an enzyme present in the subject's body. E.g., a phosphate prodrug in
which a carboxyesterase cleaves the protected molecule, resulting in the production of a
thioate anion which attacks a carbon adjacent to the O of a phosphate, resulting in the
production of an unprotected phosphate.

One or more different NRM modifications can be introduced into an iRNA agent or
into a sequence of an iRNA agent. An NRM modification can be used more than once in a
sequence or in an iRNA agent. As some NRM's interfere with hybridization, the total
number incorporated should be such that acceptable levels of iRNA agent duplex formation
are maintained.

In some embodiments NRM modifications are introduced at the
cleavage site or in the cleavage region of a sequence (a sense strand or sequence) which does
not target a desired sequence or gene in the subject. This can reduce off-target silencing.

Chiral Sp Thioates

A modification can include the alteration, e.g., replacement, of one or both of the
non-linking (X and Y) phosphate oxygens and/or of one or more of the linking (W and Z)
phosphate oxygens. Formula X below depicts a phosphate moiety linking two sugar/sugar
surrogate-base moieties, SB1 and SB2.



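[FORMULA X: a phosphate moiety, with non-linking positions X and Y and linking positions W and Z, joining SB1 and SB2; structure drawing not reproduced]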



In certain embodiments, one of the non-linking phosphate oxygens in the phosphate
backbone moiety (X and Y) can be replaced by any one of the following: S, Se, BR3 (R is
hydrogen, alkyl, aryl, etc.), C (i.e., an alkyl group, an aryl group, etc.), H, NR2 (R is
hydrogen, alkyl, aryl, etc.), or OR (R is alkyl or aryl). The phosphorus atom in an
unmodified phosphate group is achiral. However, replacement of one of the non-linking
oxygens with one of the above atoms or groups of atoms renders the phosphorus atom chiral;
in other words a phosphorus atom in a phosphate group modified in this way is a stereogenic
center. The stereogenic phosphorus atom can possess either the "R" configuration (herein
Rp) or the "S" configuration (herein Sp). Thus if 60% of a population of stereogenic
phosphorus atoms have the Rp configuration, then the remaining 40% of the population of
stereogenic phosphorus atoms have the Sp configuration.

In some embodiments, iRNA agents having phosphate groups in which a phosphate
non-linking oxygen has been replaced by another atom or group of atoms may contain a
population of stereogenic phosphorus atoms in which at least about 50% of these atoms (e.g.,
at least about 60% of these atoms, at least about 70% of these atoms, at least about 80% of
these atoms, at least about 90% of these atoms, at least about 95% of these atoms, at least
about 98% of these atoms, at least about 99% of these atoms) have the Sp configuration.
Alternatively, iRNA agents having phosphate groups in which a phosphate non-linking
oxygen has been replaced by another atom or group of atoms may contain a population of
stereogenic phosphorus atoms in which at least about 50% of these atoms (e.g., at least about
60% of these atoms, at least about 70% of these atoms, at least about 80% of these atoms, at
least about 90% of these atoms, at least about 95% of these atoms, at least about 98% of
these atoms, at least about 99% of these atoms) have the Rp configuration. In other
embodiments, the population of stereogenic phosphorus atoms may have the Sp
configuration and may be substantially free of stereogenic phosphorus atoms having the Rp
configuration. In still other embodiments, the population of stereogenic phosphorus atoms
may have the Rp configuration and may be substantially free of stereogenic phosphorus
atoms having the Sp configuration. As used herein, the phrase "substantially free of
stereogenic phosphorus atoms having the Rp configuration" means that moieties containing
stereogenic phosphorus atoms having the Rp configuration cannot be detected by
conventional methods known in the art (chiral HPLC, 1H NMR analysis using chiral shift
reagents, etc.). As used herein, the phrase "substantially free of stereogenic phosphorus
atoms having the Sp configuration" means that moieties containing stereogenic phosphorus
atoms having the Sp configuration cannot be detected by conventional methods known in the
art (chiral HPLC, 1H NMR analysis using chiral shift reagents, etc.).
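
The population percentages discussed above can be illustrated with a short sketch. It assumes, purely for illustration, that each stereogenic phosphorus atom is recorded as the string "Rp" or "Sp"; the function names and the list representation are hypothetical and are not part of the disclosure.

```python
from collections import Counter


def sp_fraction(configs):
    """Return the fraction of stereogenic phosphorus atoms labelled 'Sp'."""
    if not configs:
        raise ValueError("population is empty")
    counts = Counter(configs)
    unknown = set(counts) - {"Rp", "Sp"}
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    return counts["Sp"] / len(configs)


def meets_sp_threshold(configs, percent):
    """True if at least `percent` percent of the atoms are labelled 'Sp'.

    Integer arithmetic is used so the comparison is exact.
    """
    if not configs:
        raise ValueError("population is empty")
    sp_count = sum(1 for label in configs if label == "Sp")
    return sp_count * 100 >= percent * len(configs)


# Hypothetical population: 60% Rp, so the remaining 40% is Sp.
population = ["Rp"] * 60 + ["Sp"] * 40
print(sp_fraction(population))
print(meets_sp_threshold(population, 50))
```

The same check can be applied to the Rp configuration by swapping the labels; the thresholds recited above (50, 60, 70, 80, 90, 95, 98, 99) would be passed as `percent`.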
In a preferred embodiment, modified iRNA agents contain a phosphorothioate group,
i.e., a phosphate group in which a phosphate non-linking oxygen has been replaced by a
sulfur atom. In an especially preferred embodiment, the population of phosphorothioate
stereogenic phosphorus atoms may have the Sp configuration and be substantially free of
stereogenic phosphorus atoms having the Rp configuration.

Phosphorothioates may be incorporated into iRNA agents using dimers, e.g., formulas
X-1 and X-2. The former can be used to introduce phosphorothioate
at the 3' end of a strand, while the latter can be used to introduce this modification at the 5'
end or at a position that occurs, e.g., 1, 2, 3, 4, 5, or 6 nucleotides from either end of the
strand.

[Formulas X-1 and X-2: structure drawings not reproduced; legible labels are DMTO, BASE, and "solid phase reagent"]

In the above formulas, Y can be 2-cyanoethoxy, W and Z can be O, R2' can be, e.g., a
substituent that can impart the C-3 endo configuration to the sugar (e.g., OH, F, OCH3),
DMT is dimethoxytrityl, and "BASE" can be a natural, unusual, or a universal base.
X-1 and X-2 can be prepared using chiral reagents or directing groups that can result
in phosphorothioate-containing dimers having a population of stereogenic phosphorus atoms
having essentially only the Rp configuration (i.e., being substantially free of the Sp
configuration) or only the Sp configuration (i.e., being substantially free of the Rp
configuration). Alternatively, dimers can be prepared having a population of stereogenic
phosphorus atoms in which about 50% of the atoms have the Rp configuration and about
50% of the atoms have the Sp configuration. Dimers having stereogenic phosphorus atoms
with the Rp configuration can be identified and separated from dimers having stereogenic
phosphorus atoms with the Sp configuration using, e.g., enzymatic degradation and/or
conventional chromatography techniques.

Cationic Groups 

Modifications can also include attachment of one or more cationic groups to the
sugar, base, and/or the phosphorus atom of a phosphate or modified phosphate backbone
moiety. A cationic group can be attached to any atom capable of substitution on a natural,
unusual or universal base. A preferred position is one that does not interfere with
hybridization, i.e., does not interfere with the hydrogen bonding interactions needed for base
pairing. A cationic group can be attached, e.g., through the C2' position of a sugar or
analogous position in a cyclic or acyclic sugar surrogate. Cationic groups can include, e.g.,
protonated amino groups, derived from, e.g., O-AMINE (AMINE = NH2; alkylamino,
dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, or diheteroaryl
amino, ethylene diamine, polyamino); aminoalkoxy, e.g., O(CH2)nAMINE (e.g., AMINE =
NH2; alkylamino, dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, or
diheteroaryl amino, ethylene diamine, polyamino); amino (e.g., NH2; alkylamino,
dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, diheteroaryl amino,
or amino acid); or NH(CH2CH2NH)nCH2CH2-AMINE (AMINE = NH2; alkylamino,
dialkylamino, heterocyclyl, arylamino, diaryl amino, heteroaryl amino, or diheteroaryl
amino).

Nonphosphate Linkages 

Modifications can also include the incorporation of nonphosphate linkages at the 5'
and/or 3' end of a strand. Examples of nonphosphate linkages which can replace the
phosphate group include methyl phosphonate, hydroxylamino, siloxane, carbonate,
carboxymethyl, carbamate, amide, thioether, ethylene oxide linker, sulfonate, sulfonamide,
thioformacetal, formacetal, oxime, methyleneimino, methylenemethylimino,
methylenehydrazo, methylenedimethylhydrazo and methyleneoxymethylimino. Preferred
replacements include the methyl phosphonate and hydroxylamino groups.

3'-bridging thiophosphates and 5'-bridging thiophosphates; locked-RNA, 2'-5'
linkages, inverted linkages, α-nucleosides; conjugate groups; abasic linkages; and
5'-phosphonates and 5'-phosphate prodrugs

Referring to formula X above, modifications can include replacement of one of the
bridging or linking phosphate oxygens in the phosphate backbone moiety (W and Z). Unlike
the situation where only one of X or Y is altered, the phosphorus center in the
phosphorodithioates is achiral, which precludes the formation of iRNA agents containing a
stereogenic phosphorus atom.

Modifications can also include linking two sugars via a phosphate or modified
phosphate group through the 2' position of a first sugar and the 5' position of a second sugar.
Also contemplated are inverted linkages in which both a first and second sugar are each
linked through the respective 3' positions. Modified RNA's can also include "abasic" sugars,
which lack a nucleobase at C-1'. The sugar group can also contain one or more carbons that
possess the opposite stereochemical configuration than that of the corresponding carbon in
ribose. Thus, a modified iRNA agent can include nucleotides containing, e.g., arabinose, as
the sugar. In another subset of this modification, the natural, unusual, or universal base may
have the α-configuration. Modifications can also include L-RNA.

Modifications can also include 5'-phosphonates, e.g., P(O)(O-)2-X-C5'-sugar (X =
CH2, CF2, CHF), and 5'-phosphate prodrugs, e.g., P(O)[OCH2CH2SC(O)R]2CH2C5'-sugar.
In the latter case, the prodrug groups may be decomposed via reaction first with carboxy
esterases. The remaining ethyl thiolate group via intramolecular SN2 displacement can depart
as episulfide to afford the underivatized phosphate group.

Modification can also include the addition of conjugating groups described elsewhere
herein, which are preferably attached to an iRNA agent through any amino group available
for conjugation.
Nuclease resistant modifications include some which can be placed only at the
terminus and others which can go at any position. Generally, the modifications can
inhibit hybridization, so it is preferable to use them only in terminal regions, and preferable
not to use them at the cleavage site or in the cleavage region of a sequence which targets a
subject sequence or gene. They can be used anywhere in a sense sequence, provided that
sufficient hybridization between the two sequences of the iRNA agent is maintained. In
some embodiments it is desirable to put the NRM at the cleavage site or in the cleavage
region of a sequence which does not target a subject sequence or gene, as it can minimize
off-target silencing.

In addition, an iRNA agent described herein can have an overhang which does not
form a duplex structure with the other sequence of the iRNA agent (i.e., it is an overhang) but
which does hybridize, either with itself or with another nucleic acid other than the other
sequence of the iRNA agent.

In most cases, the nuclease-resistance promoting modifications will be distributed
differently depending on whether the sequence will target a sequence in the subject (often
referred to as an anti-sense sequence) or will not target a sequence in the subject (often
referred to as a sense sequence). If a sequence is to target a sequence in the subject,
modifications which interfere with or inhibit endonuclease cleavage should not be inserted in
the region which is subject to RISC mediated cleavage, e.g., the cleavage site or the cleavage
region. (As described in Elbashir et al., 2001, Genes and Dev. 15: 188, hereby incorporated
by reference, cleavage of the target occurs about in the middle of a 20 or 21 nt guide RNA, or
about 10 or 11 nucleotides upstream of the first nucleotide which is complementary to the
guide sequence. As used herein, cleavage site refers to the nucleotide on either side of the
cleavage site, on the target or on the iRNA agent strand which hybridizes to it. Cleavage
region means any nucleotide within 1, 2, or 3 nucleotides of the cleavage site, in either direction.)
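
The definitions of cleavage site and cleavage region given above can be sketched in code. The sketch assumes, for illustration only, that positions are numbered from 1 at the 5' end of the guide strand, that the cut falls between positions 10 and 11, and that the cleavage region includes the two cleavage-site nucleotides themselves; the function names and the `cut_after` and `window` parameters are hypothetical.

```python
def cleavage_site(cut_after=10):
    """Return the two 1-based positions flanking the assumed cut."""
    return (cut_after, cut_after + 1)


def cleavage_region(length, cut_after=10, window=3):
    """Return the 1-based positions within `window` nucleotides of the cleavage site.

    `window` is limited to 1, 2, or 3, matching the definition in the text.
    The result is clipped to the ends of the strand.
    """
    if not 1 <= cut_after < length:
        raise ValueError("cut_after must lie inside the strand")
    if window not in (1, 2, 3):
        raise ValueError("window must be 1, 2, or 3")
    left, right = cleavage_site(cut_after)
    first = max(1, left - window)
    last = min(length, right + window)
    return list(range(first, last + 1))


# A 21 nt guide strand with the assumed cut between positions 10 and 11.
print(cleavage_site())
print(cleavage_region(21))
```

Under these assumptions the positions returned by `cleavage_region` are the ones in which a modification that inhibits endonuclease cleavage would be avoided on a targeting sequence.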

Such modifications can be introduced into the terminal regions, e.g., at the terminal
position or within 2, 3, 4, or 5 positions of the terminus, of a sequence which targets or a
sequence which does not target a sequence in the subject.
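
The notion of a terminal region can be made concrete with a small sketch. It assumes that position 1 is the 5' terminus, that the last position is the 3' terminus, and that "within N positions" means a distance of at most N from the terminus; the function name and its parameters are hypothetical.

```python
def terminal_positions(length, within, end="both"):
    """List 1-based positions at, or within `within` positions of, a strand terminus.

    Position 1 is taken to be the 5' terminus and position `length` the 3' terminus.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if within < 0:
        raise ValueError("within must be non-negative")
    if end not in ("5'", "3'", "both"):
        raise ValueError("end must be \"5'\", \"3'\", or \"both\"")

    positions = set()
    for pos in range(1, length + 1):
        near_five_prime = pos - 1 <= within
        near_three_prime = length - pos <= within
        if (end in ("5'", "both") and near_five_prime) or (
            end in ("3'", "both") and near_three_prime
        ):
            positions.add(pos)
    return sorted(positions)


# Terminal regions of a 21 nt strand.
print(terminal_positions(21, 2))
print(terminal_positions(21, 5, end="3'"))
```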

An iRNA agent can have a first and a second strand chosen from the following:

a first strand which does not target a sequence and which has an NRM modification at
or within 1, 2, 3, 4, 5, or 6 positions from the 3' end;

a first strand which does not target a sequence and which has an NRM modification at
or within 1, 2, 3, 4, 5, or 6 positions from the 5' end;

a first strand which does not target a sequence and which has an NRM modification at
or within 1, 2, 3, 4, 5, or 6 positions from the 3' end and which has an NRM modification at
or within 1, 2, 3, 4, 5, or 6 positions from the 5' end;

a first strand which does not target a sequence and which has an NRM modification at
the cleavage site or in the cleavage region;

a first strand which does not target a sequence and which has an NRM modification at
the cleavage site or in the cleavage region and one or more of an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 3' end, an NRM modification at or within 1, 2, 3,
4, 5, or 6 positions from the 5' end, or NRM modifications at or within 1, 2, 3, 4, 5, or 6
positions from both the 3' and the 5' end; and

a second strand which targets a sequence and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 3' end;

a second strand which targets a sequence and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 5' end (5' end NRM modifications are
preferentially not at the terminus but rather at a position 1, 2, 3, 4, 5, or 6 away from the 5'
terminus of an antisense strand);

a second strand which targets a sequence and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 3' end and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 5' end;

a second strand which targets a sequence and which preferably does not have an
NRM modification at the cleavage site or in the cleavage region;

a second strand which targets a sequence and which does not have an NRM
modification at the cleavage site or in the cleavage region and one or more of an NRM
modification at or within 1, 2, 3, 4, 5, or 6 positions from the 3' end, an NRM modification at
or within 1, 2, 3, 4, 5, or 6 positions from the 5' end, or NRM modifications at or within 1, 2,
3, 4, 5, or 6 positions from both the 3' and the 5' end (5' end NRM modifications are
preferentially not at the terminus but rather at a position 1, 2, 3, 4, 5, or 6 away from the 5'
terminus of an antisense strand).
An iRNA agent can also target two sequences and can have a first and second strand
chosen from:

a first strand which targets a sequence and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 3' end;

a first strand which targets a sequence and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 5' end (5' end NRM modifications are
preferentially not at the terminus but rather at a position 1, 2, 3, 4, 5, or 6 away from the 5'
terminus of an antisense strand);

a first strand which targets a sequence and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 3' end and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 5' end;

a first strand which targets a sequence and which preferably does not have an
NRM modification at the cleavage site or in the cleavage region;

a first strand which targets a sequence and which does not have an NRM modification
at the cleavage site or in the cleavage region and one or more of an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 3' end, an NRM modification at or within 1, 2, 3,
4, 5, or 6 positions from the 5' end, or NRM modifications at or within 1, 2, 3, 4, 5, or 6
positions from both the 3' and the 5' end (5' end NRM modifications are preferentially not at
the terminus but rather at a position 1, 2, 3, 4, 5, or 6 away from the 5' terminus of an
antisense strand); and

a second strand which targets a sequence and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 3' end;

a second strand which targets a sequence and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 5' end (5' end NRM modifications are
preferentially not at the terminus but rather at a position 1, 2, 3, 4, 5, or 6 away from the 5'
terminus of an antisense strand);

a second strand which targets a sequence and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 3' end and which has an NRM modification at or
within 1, 2, 3, 4, 5, or 6 positions from the 5' end;

a second strand which targets a sequence and which preferably does not have an
NRM modification at the cleavage site or in the cleavage region;

a second strand which targets a sequence and which does not have an NRM
modification at the cleavage site or in the cleavage region and one or more of an NRM
modification at or within 1, 2, 3, 4, 5, or 6 positions from the 3' end, an NRM modification at
or within 1, 2, 3, 4, 5, or 6 positions from the 5' end, or NRM modifications at or within 1, 2,
3, 4, 5, or 6 positions from both the 3' and the 5' end (5' end NRM modifications are
preferentially not at the terminus but rather at a position 1, 2, 3, 4, 5, or 6 away from the 5'
terminus of an antisense strand).
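
The placement preferences recited above for a strand which targets a sequence can be summarized as a simple check. This is a minimal sketch under stated assumptions, not a definitive rule set: positions are numbered from 1 at the 5' end, the cut is assumed to fall between positions 10 and 11, the cleavage region is taken to extend 3 nucleotides to either side of the cleavage site, and the function name and all parameters are hypothetical.

```python
def check_targeting_strand(length, nrm_positions, cut_after=10, window=3,
                           max_from_end=6):
    """Return a list of notes on NRM positions that depart from the stated preferences.

    An empty list means no departure was found under the assumptions above.
    """
    # Assumed cleavage region: the two site nucleotides plus `window` on each side.
    region_lo = max(1, cut_after - window)
    region_hi = min(length, cut_after + 1 + window)

    notes = []
    for pos in sorted(set(nrm_positions)):
        if not 1 <= pos <= length:
            notes.append(f"position {pos} is outside the strand")
        elif region_lo <= pos <= region_hi:
            notes.append(f"position {pos} is in the cleavage region")
        elif pos == 1:
            notes.append("position 1 is the 5' terminus")
        elif min(pos - 1, length - pos) > max_from_end:
            notes.append(
                f"position {pos} is more than {max_from_end} positions from either end"
            )
    return notes


# Hypothetical 21 nt targeting strand.
print(check_targeting_strand(21, [2, 20]))
print(check_targeting_strand(21, [1, 10]))
```

A strand which does not target a sequence would be checked differently, since for such a strand an NRM at the cleavage site or in the cleavage region is described as acceptable or even desirable.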



Ribose Mimics 

In one aspect, the invention features a ribose mimic, or an iRNA agent which
incorporates a ribose mimic, such as those described herein and those described in copending
co-owned United States Provisional Application Serial No. 60/454,962 (Attorney Docket No.
14174-064P01), filed on March 13, 2003, which is hereby incorporated by reference.

In addition, the invention includes iRNA agents having a ribose mimic and another
element described herein. E.g., the invention includes an iRNA agent described herein, e.g.,
a palindromic iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent
which targets a gene described herein, e.g., a gene active in the liver, an iRNA agent having
an architecture or structure described herein, an iRNA associated with an amphipathic
delivery agent described herein, an iRNA associated with a drug delivery module described
herein, an iRNA agent administered as described herein, or an iRNA agent formulated as
described herein, which also incorporates a ribose mimic.

Thus, an aspect of the invention features an iRNA agent that includes a secondary
hydroxyl group, which can increase efficacy and/or confer nuclease resistance to the agent.
Nucleases, e.g., cellular nucleases, can hydrolyze nucleic acid phosphodiester bonds,
resulting in partial or complete degradation of the nucleic acid. The secondary hydroxy
group confers nuclease resistance to an iRNA agent by rendering the iRNA agent less prone
to nuclease degradation relative to an iRNA which lacks the modification. While not
wishing to be bound by theory, it is believed that the presence of a secondary hydroxyl group
on the iRNA agent can act as a structural mimic of a 3' ribose hydroxyl group, thereby
causing it to be less susceptible to degradation.
The secondary hydroxyl group refers to an "OH" radical that is attached to a carbon
atom substituted by two other carbons and a hydrogen. The secondary hydroxyl group that
confers nuclease resistance as described above can be part of any acyclic carbon-containing
group. The hydroxyl may also be part of any cyclic carbon-containing group, and preferably
one or more of the following conditions is met: (1) there is no ribose moiety between the
hydroxyl group and the terminal phosphate group or (2) the hydroxyl group is not on a sugar
moiety which is coupled to a base. The hydroxyl group is located at least two bonds (e.g., at
least three bonds away, at least four bonds away, at least five bonds away, at least six bonds
away, at least seven bonds away, at least eight bonds away, at least nine bonds away, at least
ten bonds away, etc.) from the terminal phosphate group phosphorus of the iRNA agent. In
preferred embodiments, there are five intervening bonds between the terminal phosphate
group phosphorus and the secondary hydroxyl group.

Preferred iRNA agent delivery modules with five intervening bonds between the 
terminal phosphate group phosphorus and the secondary hydroxyl group have the following 
structure (see formula Y below): 



[Formula (Y): structure drawing not reproduced; legible labels include W, R2, R5, R6, OR7, NHT, and n]



Referring to formula Y, A is an iRNA agent, including any iRNA agent described
herein. The iRNA agent may be connected directly or indirectly (e.g., through a spacer or
linker) to "W" of the phosphate group. These spacers or linkers can include, e.g., -(CH2)n-,
-(CH2)nN-, -(CH2)nO-, -(CH2)nS-, O(CH2CH2O)nCH2CH2OH (e.g., n = 3 or 6), abasic sugars,
amide, carboxy, amine, oxyamine, oxyimine, thioether, disulfide, thiourea, sulfonamide, or
morpholino, or biotin and fluorescein reagents.

The iRNA agents can have a terminal phosphate group that is unmodified (e.g., W, X,
Y, and Z are O) or modified. In a modified phosphate group, W and Z can be independently
NH, O, or S; and X and Y can be independently S, Se, BH3-, C1-C6 alkyl, C6-C10 aryl, H, O,
alkoxy or amino (including alkylamino, arylamino, etc.). Preferably, W, X and Z are O
and Y is S.

R1 and R3 are each, independently, hydrogen; or C1-C100 alkyl, optionally substituted
with hydroxyl, amino, halo, phosphate or sulfate and/or may be optionally inserted with N,
O, S, alkenyl or alkynyl.

R2 is hydrogen; C1-C100 alkyl, optionally substituted with hydroxyl, amino, halo,
phosphate or sulfate and/or may be optionally inserted with N, O, S, alkenyl or alkynyl; or,
when n is 1, R2 may be taken together with R4 or R6 to form a ring of 5-12 atoms.

R4 is hydrogen; C1-C100 alkyl, optionally substituted with hydroxyl, amino, halo,
phosphate or sulfate and/or may be optionally inserted with N, O, S, alkenyl or alkynyl; or,
when n is 1, R4 may be taken together with R2 or R5 to form a ring of 5-12 atoms.

R5 is hydrogen, C1-C100 alkyl optionally substituted with hydroxyl, amino, halo,
phosphate or sulfate and/or may be optionally inserted with N, O, S, alkenyl or alkynyl; or,
when n is 1, R5 may be taken together with R4 to form a ring of 5-12 atoms.

R6 is hydrogen, C1-C100 alkyl, optionally substituted with hydroxyl, amino, halo,
phosphate or sulfate and/or may be optionally inserted with N, O, S, alkenyl or alkynyl, or,
when n is 1, R6 may be taken together with R2 to form a ring of 6-10 atoms;

R7 is hydrogen, C1-C100 alkyl, or C(O)(CH2)qC(O)NHR9; T is hydrogen or a
functional group; n and q are each independently 1-100; R8 is C1-C10 alkyl or C6-C10 aryl;
and R9 is hydrogen, C1-C10 alkyl, C6-C10 aryl or a solid support agent.

Preferred embodiments may include one or more of the following subsets of iRNA
agent delivery modules.

In one subset of RNAi agent delivery modules, A can be connected directly or
indirectly through a terminal 3' or 5' ribose sugar carbon of the RNA agent.

In another subset of RNAi agent delivery modules, X, W, and Z are O and Y is S. 
In still yet another subset of RNAi agent delivery modules, n is 1, and R2 and R6 are
taken together to form a ring containing six atoms and R4 and R5 are taken together to form a
ring containing six atoms. Preferably, the ring system is a trans-decalin. For example, the
RNAi agent delivery module of this subset can include a compound of Formula (Y-1):

5 

A 




The functional group can be, for example, a targeting group (e.g., a steroid or a 
carbohydrate), a reporter group (e.g., a fluorophore), or a label (an isotopically labelled 
moiety). The targeting group can further include protein binding agents, endothelial cell 
targeting groups (e.g., RGD peptides and mimetics), cancer cell targeting groups (e.g., folate,
Vitamin B12, biotin), bone cell targeting groups (e.g., bisphosphonates, polyglutamates,
polyaspartates), multivalent mannose (e.g., for macrophage testing), lactose, galactose,
N-acetyl-galactosamine, monoclonal antibodies, glycoproteins, lectins, melanotropin, or
thyrotropin. 

As can be appreciated by the skilled artisan, methods of synthesizing the compounds
of the formulae herein will be evident to those of ordinary skill in the art. The synthesized
compounds can be separated from a reaction mixture and further purified by a method such
as column chromatography, high pressure liquid chromatography, or recrystallization.
Additionally, the various synthetic steps may be performed in an alternate sequence or order
to give the desired compounds. Synthetic chemistry transformations and protecting group
methodologies (protection and deprotection) useful in synthesizing the compounds described
herein are known in the art and include, for example, those such as described in R. Larock,
Comprehensive Organic Transformations, VCH Publishers (1989); T.W. Greene and P.G.M.
Wuts, Protective Groups in Organic Synthesis, 2d. Ed., John Wiley and Sons (1991); L.
Fieser and M. Fieser, Fieser and Fieser's Reagents for Organic Synthesis, John Wiley and
Sons (1994); and L. Paquette, ed., Encyclopedia of Reagents for Organic Synthesis, John
Wiley and Sons (1995), and subsequent editions thereof.

Ribose Replacement Monomer Subunits 

iRNA agents can be modified in a number of ways which can optimize one or more
characteristics of the iRNA agent. In one aspect, the invention features a ribose replacement
monomer subunit (RRMS), or an iRNA agent which incorporates an RRMS, such as those
described herein and those described in one or more of United States Provisional Application
Serial No. 60/493,986 (Attorney Docket No. 14174-079P01), filed on August 8, 2003, which
is hereby incorporated by reference; United States Provisional Application Serial No.
60/494,597 (Attorney Docket No. 14174-080P01), filed on August 11, 2003, which is hereby
incorporated by reference; United States Provisional Application Serial No. 60/506,341
(Attorney Docket No. 14174-080P02), filed on September 26, 2003, which is hereby
incorporated by reference; and in United States Provisional Application Serial No.
60/518,453 (Attorney Docket No. 14174-080P03), filed on November 7, 2003, which is
hereby incorporated by reference.

In addition, the invention includes iRNA agents having an RRMS and another element
described herein. E.g., the invention includes an iRNA agent described herein, e.g., a
palindromic iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent
which targets a gene described herein, e.g., a gene active in the liver, an iRNA agent having
an architecture or structure described herein, an iRNA associated with an amphipathic
delivery agent described herein, an iRNA associated with a drug delivery module described
herein, an iRNA agent administered as described herein, or an iRNA agent formulated as
described herein, which also incorporates an RRMS.

The ribose sugar of one or more ribonucleotide subunits of an iRNA agent can be
replaced with another moiety, e.g., a non-carbohydrate (preferably cyclic) carrier. A
ribonucleotide subunit in which the ribose sugar of the subunit has been so replaced is
referred to herein as a ribose replacement modification subunit (RRMS). A cyclic carrier
may be a carbocyclic ring system, i.e., all ring atoms are carbon atoms, or a heterocyclic ring
system, i.e., one or more ring atoms may be a heteroatom, e.g., nitrogen, oxygen, sulfur. The
cyclic carrier may be a monocyclic ring system, or may contain two or more rings, e.g. fused
rings. The cyclic carrier may be a fully saturated ring system, or it may contain one or more
double bonds.

The carriers further include (i) at least two "backbone attachment points" and (ii) at
least one "tethering attachment point." A "backbone attachment point" as used herein refers
to a functional group, e.g. a hydroxyl group, or generally, a bond available for, and that is
suitable for incorporation of the carrier into the backbone, e.g., the phosphate, or modified
phosphate, e.g., sulfur containing, backbone, of a ribonucleic acid. A "tethering attachment
point" as used herein refers to a constituent ring atom of the cyclic carrier, e.g., a carbon
atom or a heteroatom (distinct from an atom which provides a backbone attachment point),
that connects a selected moiety. The moiety can be, e.g., a ligand, e.g., a targeting or
delivery moiety, or a moiety which alters a physical property, e.g., lipophilicity, of an iRNA
agent. Optionally, the selected moiety is connected by an intervening tether to the cyclic
carrier. Thus, it will include a functional group, e.g., an amino group, or generally, provide a
bond, that is suitable for incorporation or tethering of another chemical entity, e.g., a ligand,
to the constituent ring.

Incorporation of one or more RRMSs described herein into an RNA agent, e.g., an
iRNA agent, particularly when tethered to an appropriate entity, can confer one or more new
properties to the RNA agent and/or alter, enhance or modulate one or more existing
properties in the RNA molecule. E.g., it can alter one or more of lipophilicity or nuclease
resistance. Incorporation of one or more RRMSs described herein into an iRNA agent can,
particularly when the RRMS is tethered to an appropriate entity, modulate, e.g., increase,
binding affinity of an iRNA agent to a target mRNA, change the geometry of the duplex
form of the iRNA agent, alter distribution or target the iRNA agent to a particular part of the
body, or modify the interaction with nucleic acid binding proteins (e.g., during RISC
formation and strand separation).

Accordingly, in one aspect, the invention features an iRNA agent preferably
comprising a first strand and a second strand, wherein at least one subunit having a formula
(R-1) is incorporated into at least one of said strands.

[Structure of formula (R-1) not reproduced.]

(R-1)



Referring to formula (R-1), X is N(CO)R7, NR7 or CH2; Y is NR8, O, S, CR9R10, or
absent; and Z is CR11R12 or absent.

Each of R1, R2, R3, R4, R9, and R10 is, independently, H, ORa, ORb, (CH2)nORa, or
(CH2)nORb, provided that at least one of R1, R2, R3, R4, R9, and R10 is ORa or ORb and that at
least one of R1, R2, R3, R4, R9, and R10 is (CH2)nORa or (CH2)nORb (when the RRMS is
terminal, one of R1, R2, R3, R4, R9, and R10 will include Ra and one will include Rb; when the
RRMS is internal, two of R1, R2, R3, R4, R9, and R10 will each include an Rb); further
provided that preferably ORa may only be present with (CH2)nORb and (CH2)nORa may only
be present with ORb.

Each of R5, R6, R11, and R12 is, independently, H, C1-C6 alkyl optionally substituted
with 1-3 R13, or C(O)NHR7; or R5 and R11 together are C3-C8 cycloalkyl optionally
substituted with R14.

R7 is C1-C20 alkyl substituted with NRcRd; R8 is C1-C6 alkyl; R13 is hydroxy, C1-C4
alkoxy, or halo; and R14 is NRcR7.



Ra is:

[Structure not reproduced.]

; and Rb is:

[Structure not reproduced.]

Each of A and C is, independently, O or S.

B is OH, O-, or

[Structure not reproduced.]

Rc is H or C1-C6 alkyl; Rd is H or a ligand; and n is 1-4.
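
Purely by way of illustration, the provisos on the ring substituents of formula (R-1) can be expressed as a short Python check. The sketch below is not part of the disclosed subject matter. The labels "ORa", "ORb", "(CH2)nORa" and "(CH2)nORb", the position names used as dictionary keys, and both function names are shorthand assumed here for the directly attached and the exocyclic oxygen-bearing groups.

```python
# Shorthand labels (assumed for this sketch) for the permitted substituents.
DIRECT = {"ORa", "ORb"}                 # oxygen attached directly to the ring
EXOCYCLIC = {"(CH2)nORa", "(CH2)nORb"}  # oxygen on an exocyclic alkylene
ALLOWED = DIRECT | EXOCYCLIC | {"H"}


def satisfies_provisos(substituents):
    """Check the two 'at least one' provisos.

    `substituents` maps a position name (e.g. "R1") to one of the labels above.
    """
    values = list(substituents.values())
    if any(v not in ALLOWED for v in values):
        return False
    has_direct = any(v in DIRECT for v in values)
    has_exocyclic = any(v in EXOCYCLIC for v in values)
    return has_direct and has_exocyclic


def satisfies_preferred_pairing(substituents):
    """Check the preferred pairing: ORa only with (CH2)nORb, and
    (CH2)nORa only with ORb."""
    values = set(substituents.values())
    if "ORa" in values and "(CH2)nORb" not in values:
        return False
    if "(CH2)nORa" in values and "ORb" not in values:
        return False
    return True
```

For example, a carrier bearing ORa at one position and (CH2)nORb at another, with hydrogen elsewhere, passes both checks, whereas a carrier bearing only directly attached groups fails the first.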
In a preferred embodiment the ribose is replaced with a pyrroline scaffold, and X is
N(CO)R7 or NR7, Y is CR9R10, and Z is absent.

In other preferred embodiments the ribose is replaced with a piperidine scaffold, and
X is N(CO)R7 or NR7, Y is CR9R10, and Z is CR11R12.

In other preferred embodiments the ribose is replaced with a piperazine scaffold, and
X is N(CO)R7 or NR7, Y is NR8, and Z is CR11R12.

In other preferred embodiments the ribose is replaced with a morpholino scaffold, and
X is N(CO)R7 or NR7, Y is O, and Z is CR11R12.

In other preferred embodiments the ribose is replaced with a decalin scaffold, and X
is CH2; Y is CR9R10; and Z is CR11R12; and R5 and R11 together are C6 cycloalkyl.

In other preferred embodiments the ribose is replaced with a decalin/indane scaffold,
and X is CH2; Y is CR9R10; and Z is CR11R12; and R5 and R11 together are C5
cycloalkyl.

In other preferred embodiments, the ribose is replaced with a hydroxyproline 
scaffold. 
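
As an illustrative aid only, the scaffold assignments above can be summarized as a lookup keyed on the ring atom found at each of positions X, Y and Z. The table and the function name `classify_scaffold` are assumptions made for this sketch and do not limit the embodiments described herein.

```python
# Ring atom at positions X, Y and Z; None means the position is absent.
# The keys and labels are illustrative shorthand assumed for this sketch.
SCAFFOLDS = {
    ("N", "C", None): "pyrroline",
    ("N", "C", "C"): "piperidine",
    ("N", "N", "C"): "piperazine",
    ("N", "O", "C"): "morpholino",
    # Carbocyclic case: a second ring formed by two of the ring substituents
    # is additionally required for the decalin and indane systems.
    ("C", "C", "C"): "decalin or indane",
}


def classify_scaffold(x, y, z):
    """Return the scaffold label for ring atoms X, Y, Z, or None if unlisted."""
    return SCAFFOLDS.get((x, y, z))
```

Under these assumptions, a nitrogen at X, an oxygen at Y and a carbon at Z map to the morpholino scaffold, and a combination not listed above returns None.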

RRMSs described herein may be incorporated into any double-stranded RNA-like
molecule described herein, e.g., an iRNA agent. An iRNA agent may include a duplex
comprising a hybridized sense and antisense strand, in which the antisense strand and/or the
sense strand may include one or more of the RRMSs described herein. An RRMS can be
introduced at one or more points in one or both strands of a double-stranded iRNA agent. An
RRMS can be placed at or near (within 1, 2, or 3 positions of) the 3' or 5' end of the sense
strand or at or near (within 2 or 3 positions of) the 3' end of the antisense strand. In some
embodiments it is preferred to not have an RRMS at or near (within 1, 2, or 3 positions of)
the 5' end of the antisense strand. An RRMS can be internal, and will preferably be
positioned in regions not critical for antisense binding to the target.
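
The positional preference stated above for the antisense strand can be sketched in a few lines of Python. This is a hedged illustration only: the function names, the 0-based indexing from the 5' end, and the reading of "within 1, 2, or 3 positions of" as a distance of at most three subunits from the terminal subunit are all assumptions of the sketch.

```python
def distance_to_ends(index, length):
    """Return (positions from the 5' end, positions from the 3' end) for a
    0-based index counted from the 5' end of a strand of the given length."""
    if not 0 <= index < length:
        raise ValueError("index lies outside the strand")
    return index, length - 1 - index


def rrms_position_disfavored(strand, index, length, window=3):
    """Return True when the position is at, or within `window` positions of,
    the 5' end of the antisense strand (assumed reading of the preference)."""
    from_5p, _ = distance_to_ends(index, length)
    return strand == "antisense" and from_5p <= window
```

With these assumptions, the first four subunits of a 21-subunit antisense strand are flagged, while any position on the sense strand, or the 3' terminal subunit of the antisense strand, is not.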

In an embodiment, an iRNA agent may have an RRMS at (or within 1, 2, or 3
positions of) the 3' end of the antisense strand. In an embodiment, an iRNA agent may have
an RRMS at (or within 1, 2, or 3 positions of) the 3' end of the antisense strand and at (or
within 1, 2, or 3 positions of) the 3' end of the sense strand. In an embodiment, an iRNA
agent may have an RRMS at (or within 1, 2, or 3 positions of) the 3' end of the antisense
strand and an RRMS at the 5' end of the sense strand, in which both ligands are located at the
same end of the iRNA agent.

In certain embodiments, two ligands are tethered, preferably, one on each strand and 
are hydrophobic moieties. While not wishing to be bound by theory, it is believed that 
pairing of the hydrophobic ligands can stabilize the iRNA agent via intermolecular van der 
Waals interactions. 

In an embodiment, an iRNA agent may have an RRMS at (or within 1, 2, or 3
positions of) the 3' end of the antisense strand and an RRMS at the 5' end of the sense strand,
in which both RRMSs may share the same ligand (e.g., cholic acid) via connection of their
individual tethers to separate positions on the ligand. A ligand shared between two proximal
RRMSs is referred to herein as a "hairpin ligand."

In other embodiments, an iRNA agent may have an RRMS at the 3' end of the sense
strand and an RRMS at an internal position of the sense strand. An iRNA agent may have an 
RRMS at an internal position of the sense strand; or may have an RRMS at an internal 
position of the antisense strand; or may have an RRMS at an internal position of the sense 
strand and an RRMS at an internal position of the antisense strand. 

In preferred embodiments the iRNA agent includes first and second sequences,
which are preferably two separate molecules as opposed to two sequences located on the
same strand, and which have sufficient complementarity to each other to hybridize (and
thereby form a duplex region), e.g., under physiological conditions, e.g., under physiological
conditions but not in contact with a helicase or other unwinding enzyme.

It is preferred that the first and second sequences be chosen such that the ds iRNA
agent includes a single strand or unpaired region at one or both ends of the molecule. Thus, a
ds iRNA agent contains first and second sequences, preferably paired to contain an overhang,
e.g., one or two 5' or 3' overhangs but preferably a 3' overhang of 2-3 nucleotides. Most
embodiments will have a 3' overhang. Preferred sRNA agents will have single-stranded
overhangs, preferably 3' overhangs, of 1 or preferably 2 or 3 nucleotides in length at each
end. The overhangs can be the result of one strand being longer than the other, or the result
of two strands of the same length being staggered. 5' ends are preferably phosphorylated.
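
The two ways of producing overhangs mentioned above (strands of unequal length, or strands of equal length that are staggered) can be illustrated with simple arithmetic. The following Python sketch is offered under stated assumptions only: the function name is hypothetical, the 5' end of the sense strand is assumed to pair with the first paired nucleotide of the antisense strand (i.e., there is no 5' overhang on the sense strand), and base-pairing itself is not checked.

```python
def duplex_overhangs(sense_len, antisense_len, antisense_3p_overhang):
    """Compute overhang lengths for a simple staggered-duplex model.

    Assumes the antisense strand extends `antisense_3p_overhang` nucleotides
    past the 5' end of the sense strand and that the strands are otherwise
    paired contiguously.
    """
    if sense_len <= 0 or antisense_len <= 0:
        raise ValueError("strand lengths must be positive")
    if not 0 <= antisense_3p_overhang < antisense_len:
        raise ValueError("overhang must be shorter than the antisense strand")
    # Antisense nucleotides left over for pairing after its 3' overhang.
    pairable_antisense = antisense_len - antisense_3p_overhang
    difference = sense_len - pairable_antisense
    return {
        "duplex_length": min(sense_len, pairable_antisense),
        "antisense_3p_overhang": antisense_3p_overhang,
        "sense_3p_overhang": max(difference, 0),
        "antisense_5p_overhang": max(-difference, 0),
    }
```

Under this model, two 21-nucleotide strands staggered by two positions give a 19-base-pair duplex with a 2-nucleotide 3' overhang at each end, whereas a 21-nucleotide sense strand paired with a 23-nucleotide antisense strand gives a single 2-nucleotide 3' overhang on the antisense strand.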

An RNA agent, e.g., an iRNA agent, containing a preferred, but nonlimiting RRMS is
presented as formula (R-2) in FIG. 4. The carrier includes two "backbone attachment points"
(hydroxyl groups), a "tethering attachment point," and a ligand, which is connected indirectly
to the carrier via an intervening tether. The RRMS may be the 5' or 3' terminal subunit of
the RNA molecule, i.e., one of the two "W" groups may be a hydroxyl group, and the other
"W" group may be a chain of two or more unmodified or modified ribonucleotides.
Alternatively, the RRMS may occupy an internal position, and both "W" groups may be one
or more unmodified or modified ribonucleotides. More than one RRMS may be present in an
RNA molecule, e.g., an iRNA agent.

The modified RNA molecule of formula (R-2) can be obtained using oligonucleotide
synthetic methods known in the art. In a preferred embodiment, the modified RNA molecule
of formula (R-2) can be prepared by incorporating one or more of the corresponding RRMS
monomer compounds (RRMS monomers, see, e.g., A, B, and C in FIG. 4) into a growing
sense or antisense strand, utilizing, e.g., phosphoramidite or H-phosphonate coupling
strategies.

The RRMS monomers generally include two differently functionalized hydroxyl
groups (OFG1 and OFG2 above), which are linked to the carrier molecule (see A in FIG. 4),
and a tethering attachment point. As used herein, the term "functionalized hydroxyl group"
means that the hydroxyl proton has been replaced by another substituent. As shown in
representative structures B and C, one hydroxyl group (OFG1) on the carrier is functionalized
with a protecting group (PG). The other hydroxyl group (OFG2) can be functionalized with
either (1) a liquid or solid phase synthesis support reagent (solid circle) directly or indirectly
through a linker, L, as in B, or (2) a phosphorus-containing moiety, e.g., a phosphoramidite as
in C. The tethering attachment point may be connected to a hydrogen atom, a tether, or a
tethered ligand at the time that the monomer is incorporated into the growing sense or
antisense strand (see R in Scheme 1). Thus, the tethered ligand can be, but need not be,
attached to the monomer at the time that the monomer is incorporated into the growing
strand. In certain embodiments, the tether, the ligand or the tethered ligand may be linked to
a "precursor" RRMS after a "precursor" RRMS monomer has been incorporated into the
strand.

The (OFG1) protecting group may be selected as desired, e.g., from T.W. Greene and
P.G.M. Wuts, Protective Groups in Organic Synthesis, 2d. Ed., John Wiley and Sons (1991).
The protecting group is preferably stable under amidite synthesis conditions, storage
conditions, and oligonucleotide synthesis conditions. Hydroxyl groups, -OH, are
nucleophilic groups (i.e., Lewis bases), which react through the oxygen with electrophiles
(i.e., Lewis acids). Hydroxyl groups in which the hydrogen has been replaced with a
protecting group, e.g., a triarylmethyl group or a trialkylsilyl group, are essentially unreactive
as nucleophiles in displacement reactions. Thus, the protected hydroxyl group is useful in
preventing, e.g., homocoupling of compounds exemplified by structure C during
oligonucleotide synthesis. A preferred protecting group is the dimethoxytrityl group.

When the OFG2 in B includes a linker, e.g., a long organic linker, connected to a
soluble or insoluble support reagent, solution or solid phase synthesis techniques can be
employed to build up a chain of natural and/or modified ribonucleotides once OFG1 is
deprotected and free to react as a nucleophile with another nucleoside or monomer
containing an electrophilic group (e.g., an amidite group). Alternatively, a natural or
modified ribonucleotide or oligoribonucleotide chain can be coupled to monomer C via an
amidite group or H-phosphonate group at OFG2. Subsequent to this operation, OFG1 can be
deblocked, and the restored nucleophilic hydroxyl group can react with another nucleoside or
monomer containing an electrophilic group (see FIG. 1). R' can be substituted or
unsubstituted alkyl or alkenyl. In preferred embodiments, R' is methyl, allyl or
2-cyanoethyl. R'' may be a C1-C10 alkyl group; preferably it is a branched group containing
three or more carbons, e.g., isopropyl.

OFG2 in B can be hydroxyl functionalized with a linker, which in turn contains a
liquid or solid phase synthesis support reagent at the other linker terminus. The support
reagent can be any support medium that can support the monomers described herein. The
monomer can be attached to an insoluble support via a linker, L, which allows the monomer
(and the growing chain) to be solubilized in the solvent in which the support is placed. The
solubilized, yet immobilized, monomer can react with reagents in the surrounding solvent;
unreacted reagents and soluble by-products can be readily washed away from the solid
support to which the monomer or monomer-derived products is attached. Alternatively, the
monomer can be attached to a soluble support moiety, e.g., polyethylene glycol (PEG), and
liquid phase synthesis techniques can be used to build up the chain. Linker and support
medium selection is within skill of the art. Generally the linker may be -C(O)(CH2)qC(O)-,
or -C(O)(CH2)qS-; preferably, it is oxalyl, succinyl or thioglycolyl. Standard controlled pore
glass solid phase synthesis supports cannot be used in conjunction with fluoride labile 5'
silyl protecting groups because the glass is degraded by fluoride with a significant reduction
in the amount of full-length product. Fluoride-stable polystyrene based supports or PEG are
preferred.

Preferred carriers have the general formula (R-3) provided below. (In that structure
preferred backbone attachment points can be chosen from R1 or R2; R3 or R4; or R9 and R10 if
Y is CR9R10 (two positions are chosen to give two backbone attachment points, e.g., R1 and
R4, or R4 and R9). Preferred tethering attachment points include R7; R5 or R6 when X is CH2.
The carriers are described below as an entity, which can be incorporated into a strand. Thus,
it is understood that the structures also encompass the situations wherein one (in the case of a
terminal position) or two (in the case of an internal position) of the attachment points, e.g., R1
or R2; R3 or R4; or R9 or R10 (when Y is CR9R10), is connected to the phosphate, or modified
phosphate, e.g., sulfur containing, backbone. E.g., one of the above-named R groups can be
-CH2-, wherein one bond is connected to the carrier and one to a backbone atom, e.g., a
linking oxygen or a central phosphorus atom.)

[Structure of formula (R-3) not reproduced.]

(R-3)

X is N(CO)R7, NR7 or CH2; Y is NR8, O, S, CR9R10; and Z is CR11R12 or absent.

Each of R1, R2, R3, R4, R9, and R10 is, independently, H, ORa, or (CH2)nORb, provided
that at least two of R1, R2, R3, R4, R9, and R10 are ORa and/or (CH2)nORb.

Each of R5, R6, R11, and R12 is, independently, a ligand, H, C1-C6 alkyl optionally
substituted with 1-3 R13, or C(O)NHR7; or R5 and R11 together are C3-C8 cycloalkyl
optionally substituted with R14.

R7 is H, a ligand, or C1-C20 alkyl substituted with NRcRd; R8 is H or C1-C6 alkyl; R13
is hydroxy, C1-C4 alkoxy, or halo; R14 is NRcR7; R15 is C1-C6 alkyl optionally substituted
with cyano, or C2-C6 alkenyl; R16 is C1-C10 alkyl; and R17 is a liquid or solid phase support
reagent.

L is -C(O)(CH2)qC(O)-, or -C(O)(CH2)qS-; Ra is CAr3; Rb is P(O)(O-)H,
P(OR15)N(R16)2 or L-R17; Rc is H or C1-C6 alkyl; and Rd is H or a ligand.

Each Ar is, independently, C6-C10 aryl optionally substituted with C1-C4 alkoxy; n is
1-4; and q is 0-4.

Exemplary carriers include those in which, e.g., X is N(CO)R7 or NR7, Y is CR9R10,
and Z is absent; or X is N(CO)R7 or NR7, Y is CR9R10, and Z is CR11R12; or X is N(CO)R7 or
NR7, Y is NR8, and Z is CR11R12; or X is N(CO)R7 or NR7, Y is O, and Z is CR11R12; or X is
CH2; Y is CR9R10; Z is CR11R12, and R5 and R11 together form C6 cycloalkyl (H, z = 2), or
the indane ring system, e.g., X is CH2; Y is CR9R10; Z is CR11R12, and R5 and R11 together
form C5 cycloalkyl (H, z = 1).

In certain embodiments, the carrier may be based on the pyrroline ring system or the
3-hydroxyproline ring system, e.g., X is N(CO)R7 or NR7, Y is CR9R10, and Z is absent (D).
OFG1 is preferably attached to a primary carbon, e.g., an exocyclic alkylene
group, e.g., a methylene group, connected to one of the carbons in the five-membered ring
(-CH2OFG1 in D). OFG2 is preferably attached directly to one of the carbons in the
five-membered ring (-OFG2 in D). For the pyrroline-based carriers, -CH2OFG1 may be attached
to C-2 and OFG2 may be attached to C-3; or -CH2OFG1 may be attached to C-3 and OFG2
may be attached to C-4. In certain embodiments, CH2OFG1 and OFG2 may be geminally
substituted to one of the above-referenced carbons. For the 3-hydroxyproline-based carriers,
-CH2OFG1 may be attached to C-2 and OFG2 may be attached to C-4. The pyrroline- and
3-hydroxyproline-based monomers may therefore contain linkages (e.g., carbon-carbon bonds)
wherein bond rotation is restricted about that particular linkage, e.g. restriction resulting from
the presence of a ring. Thus, CH2OFG1 and OFG2 may be cis or trans with respect to one
another in any of the pairings delineated above. Accordingly, all cis/trans isomers are
expressly included. The monomers may also contain one or more asymmetric centers and
thus occur as racemates and racemic mixtures, single enantiomers, individual diastereomers
and diastereomeric mixtures. All such isomeric forms of the monomers are expressly
included. The tethering attachment point is preferably nitrogen.

[Structure D not reproduced.]

In certain embodiments, the carrier may be based on the piperidine ring system (E),
e.g., X is N(CO)R7 or NR7, Y is CR9R10, and Z is CR11R12. OFG1 is preferably
attached to a primary carbon, e.g., an exocyclic alkylene group, e.g., a methylene group (n=1)
or ethylene group (n=2), connected to one of the carbons in the six-membered ring
[-(CH2)nOFG1 in E]. OFG2 is preferably attached directly to one of the carbons in the
six-membered ring (-OFG2 in E). -(CH2)nOFG1 and OFG2 may be disposed in a geminal manner
on the ring, i.e., both groups may be attached to the same carbon, e.g., at C-2, C-3, or C-4.
Alternatively, -(CH2)nOFG1 and OFG2 may be disposed in a vicinal manner on the ring, i.e.,
both groups may be attached to adjacent ring carbon atoms, e.g., -(CH2)nOFG1 may be
attached to C-2 and OFG2 may be attached to C-3; -(CH2)nOFG1 may be attached to C-3 and
OFG2 may be attached to C-2; -(CH2)nOFG1 may be attached to C-3 and OFG2 may be
attached to C-4; or -(CH2)nOFG1 may be attached to C-4 and OFG2 may be attached to C-3.
The piperidine-based monomers may therefore contain linkages (e.g., carbon-carbon bonds)
wherein bond rotation is restricted about that particular linkage, e.g. restriction resulting from
the presence of a ring. Thus, -(CH2)nOFG1 and OFG2 may be cis or trans with respect to one
another in any of the pairings delineated above. Accordingly, all cis/trans isomers are
expressly included. The monomers may also contain one or more asymmetric centers and
thus occur as racemates and racemic mixtures, single enantiomers, individual diastereomers
and diastereomeric mixtures. All such isomeric forms of the monomers are expressly
included. The tethering attachment point is preferably nitrogen.

[Structure E not reproduced.]
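
For illustration, the geminal and vicinal placements recited above can be enumerated programmatically. The sketch below is an aid to reading and not a definition of the embodiments; the function name is hypothetical, and it assumes that consecutively numbered ring carbons are adjacent ring atoms. Each pair gives the ring carbon bearing the exocyclic group first and the ring carbon bearing the directly attached group second.

```python
def substituent_placements(ring_positions):
    """Enumerate geminal and vicinal placements on the given ring carbons.

    Each tuple is (carbon bearing the exocyclic -(CH2)n-O group,
                   carbon bearing the directly attached -O group).
    Assumes consecutively numbered carbons are adjacent in the ring.
    """
    geminal = [(p, p) for p in ring_positions]
    vicinal = [
        (a, b)
        for a in ring_positions
        for b in ring_positions
        if abs(a - b) == 1
    ]
    return geminal, vicinal


# Piperidine-based carrier: ring carbons C-2, C-3 and C-4.
geminal, vicinal = substituent_placements([2, 3, 4])
```

For ring carbons C-2 to C-4 this yields three geminal placements and the four vicinal placements listed in the text (C-2/C-3, C-3/C-2, C-3/C-4 and C-4/C-3). The same call with a longer list of positions enumerates the corresponding placements on a larger ring system.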

In certain embodiments, the carrier may be based on the piperazine ring system (F),
e.g., X is N(CO)R7 or NR7, Y is NR8, and Z is CR11R12, or the morpholine ring system (G),
e.g., X is N(CO)R7 or NR7, Y is O, and Z is CR11R12. OFG1 is preferably
attached to a primary carbon, e.g., an exocyclic alkylene group, e.g., a methylene group,
connected to one of the carbons in the six-membered ring (-CH2OFG1 in F or G). OFG2 is
preferably attached directly to one of the carbons in the six-membered rings (-OFG2 in F or
G). For both F and G, -CH2OFG1 may be attached to C-2 and OFG2 may be attached to C-3;
or vice versa. In certain embodiments, CH2OFG1 and OFG2 may be geminally substituted to
one of the above-referenced carbons. The piperazine- and morpholine-based monomers may
therefore contain linkages (e.g., carbon-carbon bonds) wherein bond rotation is restricted
about that particular linkage, e.g. restriction resulting from the presence of a ring. Thus,
CH2OFG1 and OFG2 may be cis or trans with respect to one another in any of the pairings
delineated above. Accordingly, all cis/trans isomers are expressly included. The monomers
may also contain one or more asymmetric centers and thus occur as racemates and racemic
mixtures, single enantiomers, individual diastereomers and diastereomeric mixtures. All
such isomeric forms of the monomers are expressly included. R''' can be, e.g., C1-C6 alkyl,
preferably CH3. The tethering attachment point is preferably nitrogen in both F and G.

[Structures F and G not reproduced.]

In certain embodiments, the carrier may be based on the decalin ring system, e.g., X
is CH2; Y is CR9R10; Z is CR11R12, and R5 and R11 together form C6 cycloalkyl (H, z = 2), or
the indane ring system, e.g., X is CH2; Y is CR9R10; Z is CR11R12, and R5 and R11 together
form C5 cycloalkyl (H, z = 1). OFG1 is preferably attached to a primary carbon,
e.g., an exocyclic methylene group (n=1) or ethylene group (n=2) connected to one of C-2,
C-3, C-4, or C-5 [-(CH2)nOFG1 in H]. OFG2 is preferably attached directly to one of C-2,
C-3, C-4, or C-5 (-OFG2 in H). -(CH2)nOFG1 and OFG2 may be disposed in a geminal manner
on the ring, i.e., both groups may be attached to the same carbon, e.g., at C-2, C-3, C-4, or
C-5. Alternatively, -(CH2)nOFG1 and OFG2 may be disposed in a vicinal manner on the ring,
i.e., both groups may be attached to adjacent ring carbon atoms, e.g., -(CH2)nOFG1 may be
attached to C-2 and OFG2 may be attached to C-3; -(CH2)nOFG1 may be attached to C-3 and
OFG2 may be attached to C-2; -(CH2)nOFG1 may be attached to C-3 and OFG2 may be
attached to C-4; or -(CH2)nOFG1 may be attached to C-4 and OFG2 may be attached to C-3;
-(CH2)nOFG1 may be attached to C-4 and OFG2 may be attached to C-5; or -(CH2)nOFG1 may
be attached to C-5 and OFG2 may be attached to C-4. The decalin or indane-based
monomers may therefore contain linkages (e.g., carbon-carbon bonds) wherein bond rotation
is restricted about that particular linkage, e.g. restriction resulting from the presence of a ring.
Thus, -(CH2)nOFG1 and OFG2 may be cis or trans with respect to one another in any of the
pairings delineated above. Accordingly, all cis/trans isomers are expressly included. The
monomers may also contain one or more asymmetric centers and thus occur as racemates and
racemic mixtures, single enantiomers, individual diastereomers and diastereomeric mixtures.
All such isomeric forms of the monomers are expressly included. In a preferred
embodiment, the substituents at C-1 and C-6 are trans with respect to one another. The
tethering attachment point is preferably C-6 or C-7.

[Structure H not reproduced.]

Other carriers may include those based on 3-hydroxyproline (J). Thus, -(CH2)nOFG1 and
OFG2 may be cis or trans with respect to one another. Accordingly, all cis/trans isomers are
expressly included. The monomers may also contain one or more asymmetric centers
and thus occur as racemates and racemic mixtures, single enantiomers, individual
diastereomers and diastereomeric mixtures. All such isomeric forms of the monomers are
expressly included. The tethering attachment point is preferably nitrogen.

[Structure J not reproduced.]

Representative carriers are shown in FIG. 5.

In certain embodiments, a moiety, e.g., a ligand, may be connected indirectly to the
carrier via the intermediacy of an intervening tether. Tethers are connected to the carrier at
the tethering attachment point (TAP) and may include any C1-C100 carbon-containing moiety
(e.g. C1-C75, C1-C50, C1-C20, C1-C10, C1-C6), preferably having at least one nitrogen atom. In
preferred embodiments, the nitrogen atom forms part of a terminal amino group on the tether,
which may serve as a connection point for the ligand. Preferred tethers (underlined) include
TAP-(CH2)nNH2; TAP-C(O)(CH2)nNH2; or TAP-NR''''(CH2)nNH2, in which n is 1-6 and
R'''' is C1-C6 alkyl, and Rd is hydrogen or a ligand. In other embodiments, the nitrogen may
form part of a terminal oxyamino group, e.g., -ONH2, or hydrazine group, -NHNH2. The
tether may optionally be substituted, e.g., with hydroxy, alkoxy, perhaloalkyl, and/or
optionally inserted with one or more additional heteroatoms, e.g., N, O, or S. Preferred
tethered ligands may include, e.g., TAP-(CH2)nNH(LIGAND),
TAP-C(O)(CH2)nNH(LIGAND), or TAP-NR''''(CH2)nNH(LIGAND);
TAP-(CH2)nONH(LIGAND), TAP-C(O)(CH2)nONH(LIGAND), or
TAP-NR''''(CH2)nONH(LIGAND); TAP-(CH2)nNHNH2(LIGAND),
TAP-C(O)(CH2)nNHNH2(LIGAND), or TAP-NR''''(CH2)nNHNH2(LIGAND).

In other embodiments the tether may include an electrophilic moiety, preferably at the
terminal position of the tether. Preferred electrophilic moieties include, e.g., an aldehyde,
alkyl halide, mesylate, tosylate, nosylate, or brosylate, or an activated carboxylic acid ester,
e.g. an NHS ester, or a pentafluorophenyl ester. Preferred tethers (underlined) include
TAP-(CH2)nCHO; TAP-C(O)(CH2)nCHO; or TAP-NR''''(CH2)nCHO, in which n is 1-6 and R''''
is C1-C6 alkyl; or TAP-(CH2)nC(O)ONHS; TAP-C(O)(CH2)nC(O)ONHS; or
TAP-NR''''(CH2)nC(O)ONHS, in which n is 1-6 and R'''' is C1-C6 alkyl;
TAP-(CH2)nC(O)OC6F5; TAP-C(O)(CH2)nC(O)OC6F5; or TAP-NR''''(CH2)nC(O)OC6F5,
in which n is 1-6 and R'''' is C1-C6 alkyl; or TAP-(CH2)nCH2LG; TAP-C(O)(CH2)nCH2LG; or
TAP-NR''''(CH2)nCH2LG, in which n is 1-6 and R'''' is C1-C6 alkyl (LG can be a leaving
group, e.g., halide, mesylate, tosylate, nosylate, brosylate). Tethering can be carried out by
coupling a nucleophilic group of a ligand, e.g., a thiol or amino group, with an electrophilic
group on the tether.

Tethered Entities

A wide variety of entities can be tethered to an iRNA agent, e.g., to the carrier of an
RRMS. Examples are described below in the context of an RRMS, but that is only preferred;
entities can be coupled at other points to an iRNA agent.

Preferred moieties are ligands, which are coupled, preferably covalently, either
directly or indirectly via an intervening tether, to the RRMS carrier. In preferred
embodiments, the ligand is attached to the carrier via an intervening tether. As discussed
above, the ligand or tethered ligand may be present on the RRMS monomer when the RRMS
monomer is incorporated into the growing strand. In some embodiments, the ligand may be
incorporated into a "precursor" RRMS after a "precursor" RRMS monomer has been
incorporated into the growing strand. For example, an RRMS monomer having, e.g., an
amino-terminated tether (i.e., having no associated ligand), e.g., TAP-(CH2)nNH2, may be
incorporated into a growing sense or antisense strand. In a subsequent operation, i.e., after
incorporation of the precursor monomer into the strand, a ligand having an electrophilic
group, e.g., a pentafluorophenyl ester or aldehyde group, can be attached to the
precursor RRMS by coupling the electrophilic group of the ligand with the terminal
nucleophilic group of the precursor RRMS tether.

In preferred embodiments, a ligand alters the distribution, targeting or lifetime of an
iRNA agent into which it is incorporated. In preferred embodiments a ligand provides an
enhanced affinity for a selected target, e.g., molecule, cell or cell type, compartment, e.g., a
cellular or organ compartment, tissue, organ or region of the body, as, e.g., compared to a
species absent such a ligand. Preferred ligands will not take part in duplex pairing in a
duplexed nucleic acid.

Preferred ligands can improve transport, hybridization, and specificity properties and
may also improve nuclease resistance of the resultant natural or modified
oligoribonucleotide, or a polymeric molecule comprising any combination of monomers
described herein and/or natural or modified ribonucleotides.

Ligands in general can include therapeutic modifiers, e.g., for enhancing uptake; 
diagnostic compounds or reporter groups e.g., for monitoring distribution; cross-linking 
agents; and nuclease-resistance conferring moieties. General examples include lipids, 
steroids, vitamins, sugars, proteins, peptides, polyamines, and peptide mimics. 

Ligands can include a naturally occurring substance, such as a protein (e.g., human
serum albumin (HSA), low-density lipoprotein (LDL), or globulin); carbohydrate (e.g., a
dextran, pullulan, chitin, chitosan, inulin, cyclodextrin or hyaluronic acid); or a lipid. The
ligand may also be a recombinant or synthetic molecule, such as a synthetic polymer, e.g., a
synthetic polyamino acid. Examples of polyamino acids include
polylysine (PLL), poly L-aspartic acid, poly L-glutamic acid, styrene-maleic acid anhydride
copolymer, poly(L-lactide-co-glycolide) copolymer, divinyl ether-maleic anhydride
copolymer, N-(2-hydroxypropyl)methacrylamide copolymer (HPMA), polyethylene glycol
(PEG), polyvinyl alcohol (PVA), polyurethane, poly(2-ethylacrylic acid),
N-isopropylacrylamide polymers, or polyphosphazine. Examples of polyamines include:
polyethylenimine, polylysine (PLL), spermine, spermidine, polyamine,
pseudopeptide-polyamine, peptidomimetic polyamine, dendrimer polyamine, arginine,
amidine, protamine, cationic lipid, cationic porphyrin, quaternary salt of a polyamine, or an
alpha helical peptide.

Ligands can also include targeting groups, e.g., a cell or tissue targeting agent, e.g., a
lectin, glycoprotein, lipid or protein, e.g., an antibody, that binds to a specified cell type such
as a cancer cell, endothelial cell, or bone cell. A targeting group can be a thyrotropin,
melanotropin, lectin, glycoprotein, surfactant protein A, mucin carbohydrate, multivalent
lactose, multivalent galactose, N-acetyl-galactosamine, N-acetyl-glucosamine, multivalent
mannose, multivalent fucose, glycosylated polyaminoacids, multivalent galactose,
transferrin, bisphosphonate, polyglutamate, polyaspartate, a lipid, cholesterol, a steroid, bile
acid, folate, vitamin B12, biotin, or an RGD peptide or RGD peptide mimetic.

Other examples of ligands include dyes, intercalating agents (e.g., acridines),
cross-linkers (e.g., psoralene, mitomycin C), porphyrins (TPPC4, texaphyrin, Sapphyrin),
polycyclic aromatic hydrocarbons (e.g., phenazine, dihydrophenazine), artificial
endonucleases (e.g., EDTA), lipophilic molecules (e.g., cholesterol, cholic acid, adamantane
acetic acid, 1-pyrene butyric acid, dihydrotestosterone, 1,3-Bis-O(hexadecyl)glycerol,
geranyloxyhexyl group, hexadecylglycerol, borneol, menthol, 1,3-propanediol, heptadecyl
group, palmitic acid, myristic acid, O3-(oleoyl)lithocholic acid, O3-(oleoyl)cholenic acid,
dimethoxytrityl, or phenoxazine), peptide conjugates (e.g., antennapedia peptide, Tat
peptide), alkylating agents, phosphate, amino, mercapto, PEG (e.g., PEG-40K), MPEG,
[MPEG]2, polyamino, alkyl, substituted alkyl, radiolabeled markers, enzymes, haptens (e.g.,
biotin), transport/absorption facilitators (e.g., aspirin, vitamin E, folic acid), synthetic
ribonucleases (e.g., imidazole, bisimidazole, histamine, imidazole clusters,
acridine-imidazole conjugates, Eu3+ complexes of tetraazamacrocycles), dinitrophenyl, HRP,
or AP.

Ligands can be proteins, e.g., glycoproteins, or peptides, e.g., molecules having a
specific affinity for a co-ligand, or antibodies, e.g., an antibody that binds to a specified cell
type such as a cancer cell, endothelial cell, or bone cell. Ligands may also include hormones
and hormone receptors. They can also include non-peptidic species, such as lipids, lectins,
carbohydrates, vitamins, cofactors, multivalent lactose, multivalent galactose,
N-acetyl-galactosamine, N-acetyl-glucosamine, multivalent mannose, or multivalent fucose.
The ligand can be, for example, a lipopolysaccharide, an activator of p38 MAP kinase, or an
activator of NF-κB.

The ligand can be a substance, e.g., a drug, which can increase the uptake of the iRNA
agent into the cell, for example, by disrupting the cell's cytoskeleton, e.g., by disrupting the
cell's microtubules, microfilaments, and/or intermediate filaments. The drug can be, for
example, taxon, vincristine, vinblastine, cytochalasin, nocodazole, jasplakinolide, latrunculin
A, phalloidin, swinholide A, indanocine, or myoservin.

The ligand can increase the uptake of the iRNA agent into the cell by activating an
inflammatory response, for example. Exemplary ligands that would have such an effect
include tumor necrosis factor alpha (TNFalpha), interleukin-1 beta, or gamma interferon.

In one aspect, the ligand is a lipid or lipid-based molecule. Such a lipid or lipid-based
molecule preferably binds a serum protein, e.g., human serum albumin (HSA). An
HSA binding ligand allows for distribution of the conjugate to a target tissue, e.g., a
non-kidney target tissue of the body. Preferably, the target tissue is the liver, preferably
parenchymal cells of the liver. Other molecules that can bind HSA can also be used as
ligands. For example, naproxen or aspirin can be used. A lipid or lipid-based ligand can (a)
increase resistance to degradation of the conjugate, (b) increase targeting or transport into a
target cell or cell membrane, and/or (c) be used to adjust binding to a serum protein, e.g.,
HSA.

A lipid based ligand can be used to modulate, e.g., control the binding of the
conjugate to a target tissue. For example, a lipid or lipid-based ligand that binds to HSA
more strongly will be less likely to be targeted to the kidney and therefore less likely to be
cleared from the body. A lipid or lipid-based ligand that binds to HSA less strongly can be
used to target the conjugate to the kidney.

In a preferred embodiment, the lipid based ligand binds HSA. Preferably, it binds
HSA with a sufficient affinity such that the conjugate will be preferably distributed to a
non-kidney tissue. However, it is preferred that the affinity not be so strong that the
HSA-ligand binding cannot be reversed.

In another preferred embodiment, the lipid based ligand binds HSA weakly or not at
all, such that the conjugate will be preferably distributed to the kidney. Other moieties that
target to kidney cells can also be used in place of or in addition to the lipid based ligand.

In another aspect, the ligand is a moiety, e.g., a vitamin, which is taken up by a target
cell, e.g., a proliferating cell. These are particularly useful for treating disorders
characterized by unwanted cell proliferation, e.g., of the malignant or non-malignant type,
e.g., cancer cells. Exemplary vitamins include vitamin A, E, and K. Other exemplary
vitamins include B vitamins, e.g., folic acid, B12, riboflavin, biotin, pyridoxal or other
vitamins or nutrients taken up by cancer cells. Also included are HSA and low density
lipoprotein (LDL).

In another aspect, the ligand is a cell-permeation agent, preferably a helical
cell-permeation agent. Preferably, the agent is amphipathic. An exemplary agent is a peptide
such as tat or antennapedia. If the agent is a peptide, it can be modified, including a
peptidylmimetic, invertomers, non-peptide or pseudo-peptide linkages, and use of D-amino
acids. The helical agent is preferably an alpha-helical agent, which preferably has a
lipophilic and a lipophobic phase.

The ligand can be a peptide or peptidomimetic. A peptidomimetic (also referred to
herein as an oligopeptidomimetic) is a molecule capable of folding into a defined
three-dimensional structure similar to a natural peptide. The attachment of peptide and
peptidomimetics to iRNA agents can affect pharmacokinetic distribution of the iRNA, such
as by enhancing cellular recognition and absorption. The peptide or peptidomimetic moiety
can be about 5-50 amino acids long, e.g., about 5, 10, 15, 20, 25, 30, 35, 40, 45, or 50 amino
acids long (see Table 1, for example).

Table 1. Exemplary Cell Permeation Peptides

Cell Permeation Peptide | Amino acid Sequence | Reference
Penetratin | RQIKIWFQNRRMKWKK (SEQ ID NO:6737) | Derossi et al., J. Biol. Chem. 269:10444, 1994
Tat fragment (48-60) | GRKKRRQRRRPPQC (SEQ ID NO:6738) | Vives et al., J. Biol. Chem., 272:16010, 1997
Signal Sequence-based peptide | GALFLGWLGAAGSTMGAWSQPKKKRKV (SEQ ID NO:6738) | Chaloin et al., Biochem. Biophys. Res. Commun., 243:601, 1998
PVEC | LLIILRRRIRKQAHAHSK (SEQ ID NO:6739) | Elmquist et al., Exp. Cell Res., 269:237, 2001
Transportan | GWTLNSAGYLLKINLKALAALAKKEL (SEQ ID NO:6740) | Pooga et al., FASEB J., 12:67, 1998
Amphiphilic model peptide | KLALKLALKALKAALKLA (SEQ ID NO:6741) | Oehlke et al., Mol. Ther., 2:339, 2000
 | RRKRRRRRR (SEQ ID NO:6742) | Mitchell et al., J. Pept. Res., 56:318, 2000
Bacterial cell wall permeating | KFFKFFKFFK (SEQ ID NO:6743) |
LL-37 | LLGDFFRKSKEKIGKEFKRIVQRIKDFLRNLVPRTES (SEQ ID NO:6744) |
Cecropin P1 | SWLSKTAKKLENSAKKRISEGIAIAIQGGPR (SEQ ID NO:6745) |
α-defensin | ACYCRIPACIAGERRYGTCIYQGRLWAFCC (SEQ ID NO:6746) |
β-defensin | DHYNCVSSGGQCLYSACPIFTKIQGTCYRGKAKCCK (SEQ ID NO:6747) |
Bactenecin | RKCRIWIRVCR (SEQ ID NO:6748) |
PR-39 | RRRPRPPYLPRPRPPPFFPPRLPPRIPPGFPPRFPPRFPGKR-NH2 (SEQ ID NO:6749) |
Indolicidin | ILPWKWPWWPWRR-NH2 (SEQ ID NO:6750) |
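
As a rough illustration of the cationic character shared by many of the peptides in Table 1, the following Python sketch counts residues for three of the listed sequences and checks them against the 5-50 residue range stated above. It is only a sketch under stated assumptions: the function names are hypothetical, and the charge estimate simply counts Lys/Arg as +1 and Asp/Glu as -1, ignoring histidine, the termini, and pH.

```python
# Illustrative sketch only: function names are hypothetical and the charge
# estimate is a crude residue count, not a pKa-based calculation.

# Three sequences taken from Table 1 (one-letter amino acid code).
PEPTIDES = {
    "Tat fragment (48-60)": "GRKKRRQRRRPPQC",
    "Amphiphilic model peptide": "KLALKLALKALKAALKLA",
    "Bacterial cell wall permeating": "KFFKFFKFFK",
}


def net_charge(seq):
    """Count Lys/Arg as +1 and Asp/Glu as -1 (assumption: His and termini ignored)."""
    positive = sum(seq.count(res) for res in "KR")
    negative = sum(seq.count(res) for res in "DE")
    return positive - negative


def within_length_range(seq, shortest=5, longest=50):
    """Check the approximate 5-50 residue range given for the peptide moiety."""
    return shortest <= len(seq) <= longest


if __name__ == "__main__":
    for name, seq in PEPTIDES.items():
        print(name, len(seq), net_charge(seq), within_length_range(seq))
```

A residue count of this kind is useful only for quick comparison between candidate sequences; it does not predict permeation.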





A peptide or peptidomimetic can be, for example, a cell permeation peptide, cationic
peptide, amphipathic peptide, or hydrophobic peptide (e.g., consisting primarily of Tyr, Trp
or Phe). The peptide moiety can be a dendrimer peptide, constrained peptide or crosslinked
peptide. In another alternative, the peptide moiety can include a hydrophobic membrane
translocation sequence (MTS). An exemplary hydrophobic MTS-containing peptide is
RFGF having the amino acid sequence AAVALLPAVLLALLAP (SEQ ID NO:6751). An
RFGF analogue (e.g., amino acid sequence AALLPVLLAAP (SEQ ID NO:6752))
containing a hydrophobic MTS can also be a targeting moiety. The peptide moiety can be a
"delivery" peptide, which can carry large polar molecules including peptides,
oligonucleotides, and protein across cell membranes. For example, sequences from the HIV
Tat protein (GRKKRRQRRRPPQ (SEQ ID NO:6753)) and the Drosophila Antennapedia
protein (RQIKIWQNRRMKWKK (SEQ ID NO:6754)) have been found to be capable of
functioning as delivery peptides. A peptide or peptidomimetic can be encoded by a random
sequence of DNA, such as a peptide identified from a phage-display library, or
one-bead-one-compound (OBOC) combinatorial library (Lam et al., Nature, 354:82-84, 1991).
Preferably the peptide or peptidomimetic tethered to an iRNA agent via an incorporated
monomer unit is a cell targeting peptide such as an arginine-glycine-aspartic acid
(RGD)-peptide, or RGD mimic. A peptide moiety can range in length from about 5 amino acids to
about 40 amino acids. The peptide moieties can have a structural modification, such as to
increase stability or direct conformational properties. Any of the structural modifications
described below can be utilized.

An RGD peptide moiety can be used to target a tumor cell, such as an endothelial
tumor cell or a breast cancer tumor cell (Zitzmann et al., Cancer Res., 62:5139-43, 2002).
An RGD peptide can facilitate targeting of an iRNA agent to tumors of a variety of other
tissues, including the lung, kidney, spleen, or liver (Aoki et al., Cancer Gene Therapy
8:783-787, 2001). The RGD peptide can be linear or cyclic, and can be modified, e.g., glycosylated
or methylated to facilitate targeting to specific tissues. For example, a glycosylated RGD
peptide can deliver an iRNA agent to a tumor cell expressing αvβ3 (Haubner et al., Jour.
Nucl. Med., 42:326-336, 2001).

Peptides that target markers enriched in proliferating cells can be used. E.g., RGD
containing peptides and peptidomimetics can target cancer cells, in particular cells that
exhibit an αvβ3 integrin. Thus, one could use RGD peptides, cyclic peptides containing
RGD, RGD peptides that include D-amino acids, as well as synthetic RGD mimics. In
addition to RGD, one can use other moieties that target the αvβ3 integrin ligand. Generally,
such ligands can be used to control proliferating cells and angiogenesis. Preferred conjugates
of this type include an iRNA agent that targets PECAM-1, VEGF, or other cancer gene, e.g.,
a cancer gene described herein.

A "cell permeation peptide" is capable of permeating a cell, e.g., a microbial cell,
such as a bacterial or fungal cell, or a mammalian cell, such as a human cell. A microbial
cell-permeating peptide can be, for example, an α-helical linear peptide (e.g., LL-37 or
Cecropin P1), a disulfide bond-containing peptide (e.g., α-defensin, β-defensin or bactenecin),
or a peptide containing only one or two dominating amino acids (e.g., PR-39 or indolicidin).
A cell permeation peptide can also include a nuclear localization signal (NLS). For example,
a cell permeation peptide can be a bipartite amphipathic peptide, such as MPG, which is
derived from the fusion peptide domain of HIV-1 gp41 and the NLS of SV40 large T antigen
(Simeoni et al., Nucl. Acids Res. 31:2717-2724, 2003).

In one embodiment, a targeting peptide tethered to an RRMS can be an amphipathic
α-helical peptide. Exemplary amphipathic α-helical peptides include, but are not limited to,
cecropins, lycotoxins, paradaxins, buforin, CPF, bombinin-like peptide (BLP), cathelicidins,
ceratotoxins, S. clava peptides, hagfish intestinal antimicrobial peptides (HFIAPs),
magainines, brevinins-2, dermaseptins, melittins, pleurocidin, H2A peptides, Xenopus
peptides, esculentinis-1, and caerins. A number of factors will preferably be considered to
maintain the integrity of helix stability. For example, a maximum number of helix
stabilization residues will be utilized (e.g., leu, ala, or lys), and a minimum number of helix
destabilization residues will be utilized (e.g., proline, or cyclic monomeric units). The
capping residue will be considered (for example, Gly is an exemplary N-capping residue),
and/or C-terminal amidation can be used to provide an extra H-bond to stabilize the helix.
Formation of salt bridges between residues with opposite charges, separated by i ± 3, or i ± 4
positions can provide stability. For example, cationic residues such as lysine, arginine,
homo-arginine, ornithine or histidine can form salt bridges with the anionic residues
glutamate or aspartate.
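
The i ± 3 / i ± 4 spacing rule for stabilizing salt bridges can be checked mechanically on a one-letter sequence. The Python sketch below is an illustration only: the function name `salt_bridge_pairs` is hypothetical, and it considers just the residues with standard one-letter codes (Lys, Arg, His as cationic; Asp, Glu as anionic), so ornithine and homo-arginine are outside its scope. It reports sequence positions only and does not model helix geometry or energetics.

```python
# Illustrative sketch only: identifies candidate salt-bridge positions by
# spacing, under the assumption that the peptide is helical throughout.

CATIONIC = set("KRH")  # lysine, arginine, histidine
ANIONIC = set("DE")    # aspartate, glutamate


def salt_bridge_pairs(seq, spacings=(3, 4)):
    """Return (pos_i, pos_j, res_i, res_j) for oppositely charged residues
    separated by one of the given spacings. Positions are 1-based."""
    pairs = []
    for i, res_i in enumerate(seq):
        for gap in spacings:
            j = i + gap
            if j >= len(seq):
                continue
            res_j = seq[j]
            opposite = (
                (res_i in CATIONIC and res_j in ANIONIC)
                or (res_i in ANIONIC and res_j in CATIONIC)
            )
            if opposite:
                pairs.append((i + 1, j + 1, res_i, res_j))
    return pairs


if __name__ == "__main__":
    # Short hypothetical test sequence, not taken from the specification.
    print(salt_bridge_pairs("AEAAKA"))
```

Scanning only forward from each residue reports every i, i+3 and i, i+4 pair exactly once, which covers the i - 3 and i - 4 cases as well.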

Peptide and peptidomimetic ligands include those having naturally occurring or
modified peptides, e.g., D or L peptides; α, β, or γ peptides; N-methyl peptides; azapeptides;
peptides having one or more amide, i.e., peptide, linkages replaced with one or more urea,
thiourea, carbamate, or sulfonyl urea linkages; or cyclic peptides.

Methods for making iRNA agents

iRNA agents can include modified or non-naturally occurring bases, e.g., bases
described in copending and co-owned United States Provisional Application Serial No.
60/463,772 (Attorney Docket No. 14174-070P01), filed on April 17, 2003, which is hereby
incorporated by reference and/or in copending and co-owned United States Provisional
Application Serial No. 60/465,802 (Attorney Docket No. 14174-074P01), filed on April 25,
2003, which is hereby incorporated by reference. Monomers and iRNA agents which include
such bases can be made by the methods found in United States Provisional Application Serial
No. 60/463,772 (Attorney Docket No. 14174-070P01), filed on April 17, 2003, and/or in
United States Provisional Application Serial No. 60/465,802 (Attorney Docket No.
14174-074P01), filed on April 25, 2003.

In addition, the invention includes iRNA agents having a modified or non-naturally
occurring base and another element described herein. E.g., the invention includes an iRNA
agent described herein, e.g., a palindromic iRNA agent, an iRNA agent having a non
canonical pairing, an iRNA agent which targets a gene described herein, e.g., a gene active in
the liver, an iRNA agent having an architecture or structure described herein, an iRNA
associated with an amphipathic delivery agent described herein, an iRNA associated with a
drug delivery module described herein, an iRNA agent administered as described herein, or
an iRNA agent formulated as described herein, which also incorporates a modified or
non-naturally occurring base.

The synthesis and purification of oligonucleotide peptide conjugates can be
performed by established methods. See, for example, Trufert et al., Tetrahedron, 52:3005,
1996; and Manoharan, "Oligonucleotide Conjugates in Antisense Technology," in Antisense
Drug Technology, ed. S.T. Crooke, Marcel Dekker, Inc., 2001.

In one embodiment of the invention, a peptidomimetic can be modified to create a
constrained peptide that adopts a distinct and specific preferred conformation, which can
increase the potency and selectivity of the peptide. For example, the constrained peptide can
be an azapeptide (Gante, Synthesis, 405-413, 1989). An azapeptide is synthesized by
replacing the α-carbon of an amino acid with a nitrogen atom without changing the structure
of the amino acid side chain. For example, the azapeptide can be synthesized by using
hydrazine in traditional peptide synthesis coupling methods, such as by reacting hydrazine
with a "carbonyl donor," e.g., phenylchloroformate.

In one embodiment of the invention, a peptide or peptidomimetic (e.g., a peptide or
peptidomimetic tethered to an RRMS) can be an N-methyl peptide. N-methyl peptides are
composed of N-methyl amino acids, which provide an additional methyl group in the
peptide backbone, thereby potentially providing additional means of resistance to proteolytic
cleavage. N-methyl peptides can be synthesized by methods known in the art (see, for
example, Lindgren et al., Trends Pharmacol. Sci. 21:99, 2000; Cell Penetrating Peptides:
Processes and Applications, Langel, ed., CRC Press, Boca Raton, FL, 2002; Fischer et al.,
Bioconjugate Chem. 12:825, 2001; Wender et al., J. Am. Chem. Soc., 124:13382, 2002).
For example, an Ant or Tat peptide can be an N-methyl peptide.

In one embodiment of the invention, a peptide or peptidomimetic (e.g., a peptide or
peptidomimetic tethered to an RRMS) can be a β-peptide. β-peptides form stable secondary
structures such as helices, pleated sheets, turns and hairpins in solutions. Their cyclic
derivatives can fold into nanotubes in the solid state. β-peptides are resistant to degradation
by proteolytic enzymes. β-peptides can be synthesized by methods known in the art. For
example, an Ant or Tat peptide can be a β-peptide.

In one embodiment of the invention, a peptide or peptidomimetic (e.g., a peptide or
peptidomimetic tethered to an RRMS) can be an oligocarbamate. Oligocarbamate peptides are
internalized into a cell by a transport pathway facilitated by carbamate transporters. For
example, an Ant or Tat peptide can be an oligocarbamate.

In one embodiment of the invention, a peptide or peptidomimetic (e.g., a peptide or
peptidomimetic tethered to an RRMS) can be an oligourea conjugate (or an oligothiourea
conjugate), in which the amide bond of a peptidomimetic is replaced with a urea moiety.
Replacement of the amide bond provides increased resistance to degradation by proteolytic
enzymes, e.g., proteolytic enzymes in the gastrointestinal tract. In one embodiment, an
oligourea conjugate is tethered to an iRNA agent for use in oral delivery. The backbone in
each repeating unit of an oligourea peptidomimetic can be extended by one carbon atom in
comparison with the natural amino acid. The single carbon atom extension can increase
peptide stability and lipophilicity, for example. An oligourea peptide can therefore be
advantageous when an iRNA agent is directed for passage through a bacterial cell wall, or
when an iRNA agent must traverse the blood-brain barrier, such as for the treatment of a
neurological disorder. In one embodiment, a hydrogen bonding unit is conjugated to the
oligourea peptide, such as to create an increased affinity with a receptor. For example, an
Ant or Tat peptide can be an oligourea conjugate (or an oligothiourea conjugate).

The siRNA peptide conjugates of the invention can be affiliated with, e.g., tethered 
to, RRMSs occurring at various positions on an iRNA agent. For example, a peptide can be 
terminally conjugated, on either the sense or the antisense strand, or a peptide can be 
bisconjugated (one peptide tethered to each end, one conjugated to the sense strand, and one 
conjugated to the antisense strand). In another option, the peptide can be internally 
conjugated, such as in the loop of a short hairpin iRNA agent. In yet another option, the 
peptide can be affiliated with a complex, such as a peptide-carrier complex. 

A peptide-carrier complex consists of at least a carrier molecule, which can 
encapsulate one or more iRNA agents (such as for delivery to a biological system and/or a 
cell), and a peptide moiety tethered to the outside of the carrier molecule, such as for 
targeting the carrier complex to a particular tissue or cell type. A carrier complex can carry 
additional targeting molecules on the exterior of the complex, or fusogenic agents to aid in 
cell delivery. The one or more iRNA agents encapsulated within the carrier can be 
conjugated to lipophilic molecules, which can aid in the delivery of the agents to the interior 
of the carrier.

A carrier molecule or structure can be, for example, a micelle, a liposome (e.g., a 
cationic liposome), a nanoparticle, a microsphere, or a biodegradable polymer. A peptide 
moiety can be tethered to the carrier molecule by a variety of linkages, such as a disulfide 
linkage, an acid labile linkage, a peptide-based linkage, an oxyamino linkage or a hydrazine 
linkage. For example, a peptide-based linkage can be a GFLG peptide. Certain linkages will 
have particular advantages, and the advantages (or disadvantages) can be considered 
depending on the tissue target or intended use. For example, peptide based linkages are 
stable in the blood stream but are susceptible to enzymatic cleavage in the lysosomes.

Targeting 

The iRNA agents of the invention are particularly useful when targeted to the liver.
An iRNA agent can be targeted to the liver by incorporation of an RRMS containing a ligand
that targets the liver. For example, a liver-targeting agent can be a lipophilic moiety.
Preferred lipophilic moieties include lipid, cholesterols, oleyl, retinyl, or cholesteryl residues.
Other lipophilic moieties that can function as liver-targeting agents include cholic acid,
adamantane acetic acid, 1-pyrene butyric acid, dihydrotestosterone,
1,3-Bis-O(hexadecyl)glycerol, geranyloxyhexyl group, hexadecylglycerol, borneol, menthol,
1,3-propanediol, heptadecyl group, palmitic acid, myristic acid, O3-(oleoyl)lithocholic acid,
O3-(oleoyl)cholenic acid, dimethoxytrityl, or phenoxazine.

An iRNA agent can also be targeted to the liver by association with a low-density
lipoprotein (LDL), such as lactosylated LDL. Polymeric carriers complexed with sugar
residues can also function to target iRNA agents to the liver.

A targeting agent that incorporates a sugar, e.g., galactose and/or analogues thereof, is
particularly useful. These agents target, in particular, the parenchymal cells of the liver. For
example, a targeting moiety can include more than one, preferably two or three, galactose
moieties, spaced about 15 angstroms from each other. The targeting moiety can alternatively
be lactose (e.g., three lactose moieties), which is glucose coupled to a galactose. The
targeting moiety can also be N-acetyl-galactosamine or N-acetyl-glucosamine. A mannose or
mannose-6-phosphate targeting moiety can be used for macrophage targeting.

Conjugation of an iRNA agent with a serum albumin (SA), such as human serum 
albumin, can also be used to target the iRNA agent to the liver. 

An iRNA agent targeted to the liver by an RRMS targeting moiety described herein
can target a gene expressed in the liver. For example, the iRNA agent can target
p21(WAF1/CIP1), p27(KIP1), the α-fetoprotein gene, beta-catenin, or c-MET, such as for
treating a cancer of the liver. In another embodiment, the iRNA agent can target apoB-100,
such as for the treatment of an HDL/LDL cholesterol imbalance; dyslipidemias, e.g., familial
combined hyperlipidemia (FCHL), or acquired hyperlipidemia; hypercholesterolemia;
statin-resistant hypercholesterolemia; coronary artery disease (CAD); coronary heart disease
(CHD); or atherosclerosis. In another embodiment, the iRNA agent can target forkhead
homologue in rhabdomyosarcoma (FKHR); glucagon; glucagon receptor; glycogen
phosphorylase; PPAR-Gamma Coactivator (PGC-1); fructose-1,6-bisphosphatase;
glucose-6-phosphatase; glucose-6-phosphate translocator; glucokinase inhibitory regulatory
protein; or phosphoenolpyruvate carboxykinase (PEPCK), such as to inhibit hepatic glucose
production in a mammal, such as a human, such as for the treatment of diabetes. In another
embodiment, an iRNA agent targeted to the liver can target Factor V, e.g., the Leiden Factor
V allele, such as to reduce the tendency to form a blood clot. An iRNA agent targeted to the
liver can include a sequence which targets hepatitis virus (e.g., Hepatitis A, B, C, D, E, F, G,
or H). For example, an iRNA agent of the invention can target any one of the nonstructural
proteins of HCV: NS3, 4A, 4B, 5A, or 5B. For the treatment of hepatitis B, an iRNA agent
can target the protein X (HBx) gene, for example.

Preferred ligands on RRMSs include folic acid, glucose, cholesterol, cholic acid, 
Vitamin E, Vitamin K, or Vitamin A. 

Definitions

The term "halo" refers to any radical of fluorine, chlorine, bromine or iodine.
The term "alkyl" refers to a hydrocarbon chain that may be a straight chain or
branched chain, containing the indicated number of carbon atoms. For example, C1-C12 alkyl
indicates that the group may have from 1 to 12 (inclusive) carbon atoms in it. The term
"haloalkyl" refers to an alkyl in which one or more hydrogen atoms are replaced by halo, and
includes alkyl moieties in which all hydrogens have been replaced by halo (e.g.,
perfluoroalkyl). Alkyl and haloalkyl groups may be optionally inserted with O, N, or S. The
term "aralkyl" refers to an alkyl moiety in which an alkyl hydrogen atom is replaced by an
aryl group. Aralkyl includes groups in which more than one hydrogen atom has been
replaced by an aryl group. Examples of "aralkyl" include benzyl, 9-fluorenyl, benzhydryl,
and trityl groups.
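
The carbon-count notation defined above (e.g., C1-C12 alkyl meaning 1 to 12 carbon atoms, inclusive) can be read mechanically. The following Python sketch is offered only as an illustration of the inclusive-range convention; the function names `parse_carbon_range` and `is_within` are hypothetical and handle only labels of the plain form "C<low>-C<high>".

```python
# Illustrative sketch only: parses labels such as "C1-C12" and applies the
# inclusive carbon-count range they denote.
import re


def parse_carbon_range(label):
    """Return (low, high) for a label of the form 'C1-C12'."""
    match = re.fullmatch(r"C(\d+)-C(\d+)", label)
    if match is None:
        raise ValueError(f"not a carbon range label: {label!r}")
    low, high = int(match.group(1)), int(match.group(2))
    if low > high:
        raise ValueError(f"lower bound exceeds upper bound: {label!r}")
    return low, high


def is_within(label, n_carbons):
    """True if n_carbons falls in the inclusive range named by label."""
    low, high = parse_carbon_range(label)
    return low <= n_carbons <= high


if __name__ == "__main__":
    print(parse_carbon_range("C1-C12"))
    print(is_within("C1-C6", 6), is_within("C1-C6", 7))
```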

The term "alkenyl" refers to a straight or branched hydrocarbon chain containing 2-8
carbon atoms and characterized in having one or more double bonds. Examples of a typical
alkenyl include, but are not limited to, allyl, propenyl, 2-butenyl, 3-hexenyl and 3-octenyl
groups. The term "alkynyl" refers to a straight or branched hydrocarbon chain containing 2-8
carbon atoms and characterized in having one or more triple bonds. Some examples of a
typical alkynyl are ethynyl, 2-propynyl, 3-methylbutynyl, and propargyl. The sp2 and
sp3 carbons may optionally serve as the point of attachment of the alkenyl and alkynyl
groups, respectively.

The term "alkoxy" refers to an -O-alkyl radical. The term "aminoalkyl" refers to an
alkyl substituted with an amino. The term "mercapto" refers to an -SH radical. The term
"thioalkoxy" refers to an -S-alkyl radical.

The term "alkylene" refers to a divalent alkyl (i.e., -R-), e.g., -CH2-, -CH2CH2-, and
-CH2CH2CH2-. The term "alkylenedioxo" refers to a divalent species of the structure
-O-R-O-, in which R represents an alkylene.

The term "aryl" refers to an aromatic monocyclic, bicyclic, or tricyclic hydrocarbon
ring system, wherein any ring atom capable of substitution can be substituted by a
substituent. Examples of aryl moieties include, but are not limited to, phenyl, naphthyl, and
anthracenyl.

The term "cycloalkyl" as employed herein includes saturated cyclic, bicyclic,
tricyclic, or polycyclic hydrocarbon groups having 3 to 12 carbons, wherein any ring atom
capable of substitution can be substituted by a substituent. The cycloalkyl groups herein
described may also contain fused rings. Fused rings are rings that share a common
carbon-carbon bond. Examples of cycloalkyl moieties include, but are not limited to, cyclohexyl,
adamantyl, and norbornyl.

The term "heterocyclyl" refers to a nonaromatic 3-10 membered monocyclic, 8-12
membered bicyclic, or 11-14 membered tricyclic ring system having 1-3 heteroatoms if
monocyclic, 1-6 heteroatoms if bicyclic, or 1-9 heteroatoms if tricyclic, said heteroatoms
selected from O, N, or S (e.g., carbon atoms and 1-3, 1-6, or 1-9 heteroatoms of N, O, or S if
monocyclic, bicyclic, or tricyclic, respectively), wherein any ring atom capable of
substitution can be substituted by a substituent. The heterocyclyl groups herein described
may also contain fused rings. Fused rings are rings that share a common carbon-carbon
bond. Examples of heterocyclyl include, but are not limited to, tetrahydrofuranyl,
tetrahydropyranyl, piperidinyl, morpholino, pyrrolinyl and pyrrolidinyl.

The term "heteroaryl" refers to an aromatic 5-8 membered monocyclic, 8-12
membered bicyclic, or 11-14 membered tricyclic ring system having 1-3 heteroatoms if
monocyclic, 1-6 heteroatoms if bicyclic, or 1-9 heteroatoms if tricyclic, said heteroatoms
selected from O, N, or S (e.g., carbon atoms and 1-3, 1-6, or 1-9 heteroatoms of N, O, or S if
monocyclic, bicyclic, or tricyclic, respectively), wherein any ring atom capable of
substitution can be substituted by a substituent.
The term "oxo" refers to an oxygen atom, which forms a carbonyl when attached to
carbon, an N-oxide when attached to nitrogen, and a sulfoxide or sulfone when attached to 
sulfur. 

The term "acyl" refers to an alkylcarbonyl, cycloalkylcarbonyl, arylcarbonyl,
heterocyclylcarbonyl, or heteroarylcarbonyl substituent, any of which may be further
substituted by substituents. 

The term "substituents" refers to a group "substituted" on an alkyl, cycloalkyl, 
alkenyl, alkynyl, heterocyclyl, heterocycloalkenyl, cycloalkenyl, aryl, or heteroaryl group at 
any atom of that group. Suitable substituents include, without limitation, alkyl, alkenyl, 
alkynyl, alkoxy, halo, hydroxy, cyano, nitro, amino, SO3H, sulfate, phosphate,
perfluoroalkyl, perfluoroalkoxy, methylenedioxy, ethylenedioxy, carboxyl, oxo, thioxo,
imino (alkyl, aryl, aralkyl), S(O)n alkyl (where n is 0-2), S(O)n aryl (where n is 0-2), S(O)n
heteroaryl (where n is 0-2), S(O)n heterocyclyl (where n is 0-2), amine (mono-, di-, alkyl,
cycloalkyl, aralkyl, heteroaralkyl, and combinations thereof), ester (alkyl, aralkyl,
heteroaralkyl), amide (mono-, di-, alkyl, aralkyl, heteroaralkyl, and combinations thereof),
sulfonamide (mono-, di-, alkyl, aralkyl, heteroaralkyl, and combinations thereof),
unsubstituted aryl, unsubstituted heteroaryl, unsubstituted heterocyclyl, and unsubstituted
cycloalkyl. In one aspect, the substituents on a group are independently any one single, or
any subset of the aforementioned substituents.

The terms "adeninyl, cytosinyl, guaninyl, thyminyl, and uracilyl" and the like refer to
radicals of adenine, cytosine, guanine, thymine, and uracil.

As used herein, an "unusual" nucleobase can include any one of the following: 
2-methyladeninyl,
N6-methyladeninyl,
2-methylthio-N6-methyladeninyl,
N6-isopentenyladeninyl,
2-methylthio-N6-isopentenyladeninyl,
N6-(cis-hydroxyisopentenyl)adeninyl,
2-methylthio-N6-(cis-hydroxyisopentenyl)adeninyl,
N6-glycinylcarbamoyladeninyl,
N6-threonylcarbamoyladeninyl,
2-methylthio-N6-threonylcarbamoyladeninyl,
N6-methyl-N6-threonylcarbamoyladeninyl,
N6-hydroxynorvalylcarbamoyladeninyl,
2-methylthio-N6-hydroxynorvalylcarbamoyladeninyl,
N6,N6-dimethyladeninyl,
3-methylcytosinyl,
5-methylcytosinyl,
2-thiocytosinyl,
5-formylcytosinyl,
N4-methylcytosinyl,
5-hydroxymethylcytosinyl,
1-methylguaninyl,
N2-methylguaninyl,
7-methylguaninyl,
N2,N2-dimethylguaninyl,



[Structural formulae of further modified bases appeared here in the original; only the
substituent labels NHCOOCH3, H3COOC, and CH3 are recoverable.]




N2,7-dimethylguaninyl,
N2,N2,7-trimethylguaninyl,
1-methylguaninyl,
7-cyano-7-deazaguaninyl,
7-aminomethyl-7-deazaguaninyl,
pseudouracilyl,
dihydrouracilyl,
5-methyluracilyl,
1-methylpseudouracilyl,
2-thiouracilyl,
4-thiouracilyl,
2-thiothyminyl,
5-methyl-2-thiouracilyl,
3-(3-amino-3-carboxypropyl)uracilyl,
5-hydroxyuracilyl,
5-methoxyuracilyl,
uracilyl 5-oxyacetic acid,
uracilyl 5-oxyacetic acid methyl ester,
5-(carboxyhydroxymethyl)uracilyl,
5-(carboxyhydroxymethyl)uracilyl methyl ester,
5-methoxycarbonylmethyluracilyl,
5-methoxycarbonylmethyl-2-thiouracilyl,
5-aminomethyl-2-thiouracilyl,
5-methylaminomethyluracilyl,
5-methylaminomethyl-2-thiouracilyl,
5-methylaminomethyl-2-selenouracilyl,
5-carbamoylmethyluracilyl,
5-carboxymethylaminomethyluracilyl,
5-carboxymethylaminomethyl-2-thiouracilyl,
3-methyluracilyl,
1-methyl-3-(3-amino-3-carboxypropyl)pseudouracilyl,
5-carboxymethyluracilyl,
5-methyldihydrouracilyl, or
3-methylpseudouracilyl.

Asymmetrical Modifications 

In one aspect, the invention features an iRNA agent which can be asymmetrically 
modified as described herein. 

In addition, the invention includes iRNA agents having asymmetrical modifications 
and another element described herein. E.g., the invention includes an iRNA agent described 
herein, e.g., a palindromic iRNA agent, an iRNA agent having a non canonical pairing, an
iRNA agent which targets a gene described herein, e.g., a gene active in the liver, an iRNA 
agent having an architecture or structure described herein, an iRNA associated with an 
amphipathic delivery agent described herein, an iRNA associated with a drug delivery 
module described herein, an iRNA agent administered as described herein, or an iRNA agent 
formulated as described herein, which also incorporates an asymmetrical modification. 

iRNA agents of the invention can be asymmetrically modified. An asymmetrically
modified iRNA agent is one in which a strand has a modification which is not present on the
other strand. An asymmetrical modification is a modification found on one strand but not on
the other strand. Any modification, e.g., any modification described herein, can be present as
an asymmetrical modification. An asymmetrical modification can confer any of the desired
properties associated with a modification, e.g., those properties discussed herein. E.g., an
asymmetrical modification can: confer resistance to degradation, an alteration in half life;
target the iRNA agent to a particular target, e.g., to a particular tissue; modulate, e.g.,
increase or decrease, the affinity of a strand for its complement or target sequence; or hinder
or promote modification of a terminal moiety, e.g., modification by a kinase or other
enzymes involved in the RISC mechanism pathway. The designation of a modification as
having one property does not mean that it has no other property, e.g., a modification referred
to as one which promotes stabilization might also enhance targeting.
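
As an illustration only, and not as part of the disclosed methods, the definition above can be read as a set difference: a modification is asymmetrical when it is found on one strand but not on the other. The following minimal sketch assumes each strand is summarized by a collection of free-text modification labels; the function names and labels are hypothetical.

```python
def asymmetrical_modifications(sense_mods, antisense_mods):
    """Return the modifications that are unique to each strand.

    Each argument is an iterable of modification labels (hypothetical
    free-text names such as "2'-OMe" or "phosphorothioate").
    """
    sense, antisense = set(sense_mods), set(antisense_mods)
    return {
        "sense_only": sense - antisense,        # on the sense strand only
        "antisense_only": antisense - sense,    # on the antisense strand only
    }


def is_asymmetrically_modified(sense_mods, antisense_mods):
    """True if at least one strand carries a modification absent from the other."""
    unique = asymmetrical_modifications(sense_mods, antisense_mods)
    return bool(unique["sense_only"] or unique["antisense_only"])
```

Under this reading, an agent whose two strands carry exactly the same set of modifications is not asymmetrically modified, whereas an agent with, e.g., cholesterol on the sense strand only is.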

While not wishing to be bound by theory or any particular mechanistic model, it is
believed that asymmetrical modification allows an iRNA agent to be optimized in view of the
different or "asymmetrical" functions of the sense and antisense strands. For example, both
strands can be modified to increase nuclease resistance; however, since some changes can
inhibit RISC activity, these changes can be chosen for the sense strand. In addition, since
some modifications, e.g., targeting moieties, can add large bulky groups that, e.g., can
interfere with the cleavage activity of the RISC complex, such modifications are preferably
placed on the sense strand. Thus, targeting moieties, especially bulky ones (e.g., cholesterol),
are preferentially added to the sense strand. In one embodiment, an asymmetrical
modification in which a phosphate of the backbone is substituted with S, e.g., a
phosphorothioate modification, is present in the antisense strand, and a 2' modification, e.g.,
2' OMe, is present in the sense strand. A targeting moiety can be present at either (or both)
the 5' or 3' end of the sense strand of the iRNA agent. In a preferred example, a P of the
backbone is replaced with S in the antisense strand, 2' OMe is present in the sense strand, and
a targeting moiety is added to either the 5' or 3' end of the sense strand of the iRNA agent.

In a preferred embodiment an asymmetrically modified iRNA agent has a
modification on the sense strand which modification is not found on the antisense strand and
the antisense strand has a modification which is not found on the sense strand.

Each strand can include one or more asymmetrical modifications. By way of 
example: one strand can include a first asymmetrical modification which confers a first 
property on the iRNA agent and the other strand can have a second asymmetrical 
modification which confers a second property on the iRNA. E.g., one strand, e.g., the sense
strand, can have a modification which targets the iRNA agent to a tissue, and the other strand,
e.g., the antisense strand, has a modification which promotes hybridization with the target 
gene sequence. 

In some embodiments both strands can be modified to optimize the same property, 
e.g., to increase resistance to nucleolytic degradation, but different modifications are chosen 
for the sense and the antisense strands, e.g., because the modifications affect other properties
as well. E.g., since some changes can affect RISC activity these modifications are chosen for 
the sense strand. 

In an embodiment one strand has an asymmetrical 2' modification, e.g., a 2' OMe
modification, and the other strand has an asymmetrical modification of the phosphate
backbone, e.g., a phosphorothioate modification. So, in one embodiment the antisense strand
has an asymmetrical 2' OMe modification and the sense strand has an asymmetrical
phosphorothioate modification (or vice versa). In a particularly preferred embodiment the
RNAi agent will have asymmetrical 2'-O alkyl, preferably 2'-OMe, modifications on the
sense strand and an asymmetrical backbone P modification, preferably a phosphorothioate
modification, in the antisense strand. There can be one or multiple 2'-OMe modifications,
e.g., at least 2, 3, 4, 5, or 6 of the subunits of the sense strand can be so modified. There can
be one or multiple phosphorothioate modifications, e.g., at least 2, 3, 4, 5, or 6 of the
subunits of the antisense strand can be so modified. It is preferable to have an iRNA agent
wherein there are multiple 2'-OMe modifications on the sense strand and multiple
phosphorothioate modifications on the antisense strand. All of the subunits on one or both
strands can be so modified. A particularly preferred embodiment of multiple asymmetric
modification on both strands has a duplex region about 20-21, and preferably 19, subunits in
length and one or two 3' overhangs of about 2 subunits in length.
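
By way of a hedged illustration, the particularly preferred pattern described above (multiple 2'-OMe modifications confined to the sense strand and multiple phosphorothioate modifications confined to the antisense strand) can be checked programmatically. The sketch below assumes each strand is represented as a list with one set of modification labels per subunit; the labels "2'-OMe" and "PS", the function name, and the default minimum of 2 are assumptions made for the example, not terms defined in this document.

```python
def has_preferred_pattern(sense_subunits, antisense_subunits, minimum=2):
    """Check for 2'-OMe on the sense strand only and PS on the antisense strand only.

    Each argument is a list containing one set of modification labels per
    subunit. `minimum` is the least number of modified subunits required
    on each strand (the text gives "at least 2, 3, 4, 5, or 6").
    """
    sense_ome = sum("2'-OMe" in s for s in sense_subunits)
    sense_ps = sum("PS" in s for s in sense_subunits)
    anti_ome = sum("2'-OMe" in s for s in antisense_subunits)
    anti_ps = sum("PS" in s for s in antisense_subunits)
    return (
        sense_ome >= minimum
        and anti_ps >= minimum
        and sense_ps == 0      # PS stays asymmetrical: absent from the sense strand
        and anti_ome == 0      # 2'-OMe stays asymmetrical: absent from the antisense strand
    )
```

The check is deliberately strict: if either modification also occurs on the opposite strand it is no longer asymmetrical in the sense defined above, so the function returns False.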

Asymmetrical modifications are useful for promoting resistance to degradation by
nucleases, e.g., endonucleases. iRNA agents can include one or more asymmetrical
modifications which promote resistance to degradation. In preferred embodiments the
modification on the antisense strand is one which will not interfere with silencing of the
target, e.g., one which will not interfere with cleavage of the target. Most if not all sites on a
strand are vulnerable, to some degree, to degradation by endonucleases. One can determine
sites which are relatively vulnerable and insert asymmetrical modifications which inhibit
degradation. It is often desirable to provide asymmetrical modification of a UA site in an
iRNA agent, and in some cases it is desirable to provide the UA sequence on both strands
with asymmetrical modification. Examples of modifications which inhibit endonucleolytic
degradation can be found herein. Particularly favored modifications include: 2'
modification, e.g., provision of a 2' OMe moiety on the U, especially on a sense strand;
modification of the backbone, e.g., with the replacement of an O with an S, in the phosphate
backbone, e.g., the provision of a phosphorothioate modification, on the U or the A or both,
especially on an antisense strand; replacement of the U with a C5 amino linker; replacement
of the A with a G (sequence changes are preferred to be located on the sense strand and not
the antisense strand); and modification of the at the 2', 6', 7', or 8' position. Preferred
embodiments are those in which one or more of these modifications are present on the sense
but not the antisense strand, or embodiments where the antisense strand has fewer of such
modifications.
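
As a small illustrative sketch, the UA sites referred to above can be located in a strand by scanning for each U that is immediately followed by an A. The function name and the use of a plain string of one-letter base codes are assumptions made for the example; the text does not prescribe any particular way of identifying vulnerable sites.

```python
def find_ua_sites(sequence):
    """Return the 0-based index of every U that is immediately followed by A.

    `sequence` is a strand written as one-letter base codes; case is ignored.
    """
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "UA"]
```

Each returned index marks a U (with the A at the next position) that could then be considered for one of the modifications listed above.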

Asymmetrical modification can be used to inhibit degradation by exonucleases. 
Asymmetrical modifications can include those in which only one strand is modified as well 
as those in which both are modified. In preferred embodiments the modification on the 
antisense strand is one which will not interfere with silencing of the target, e.g., one which 
will not interfere with cleavage of the target. Some embodiments will have an asymmetrical 
modification on the sense strand, e.g., in a 3' overhang, e.g., at the 3' terminus, and on the 
antisense strand, e.g., in a 3' overhang, e.g., at the 3' terminus. If the modifications introduce 
moieties of different size it is preferable that the larger be on the sense strand. If the 
modifications introduce moieties of different charge it is preferable that the one with greater 
charge be on the sense strand. 

Examples of modifications which inhibit exonucleolytic degradation can be found 
herein. Particularly favored modifications include: 2' modification, e.g., provision of a 2' 
OMe moiety in a 3' overhang, e.g., at the 3' terminus (3' terminus means at the 3' atom of
the molecule or at the most 3' moiety, e.g., the most 3' P or 2' position, as indicated by the
context); modification of the backbone, e.g., with the replacement of a P with an S, e.g., the
provision of a phosphorothioate modification, or the use of a methylated P in a 3' overhang,
e.g., at the 3' terminus; combination of a 2' modification, e.g., provision of a 2' OMe
moiety, and modification of the backbone, e.g., with the replacement of a P with an S, e.g.,
the provision of a phosphorothioate modification, or the use of a methylated P, in a 3'
overhang, e.g., at the 3' terminus; modification with a 3' alkyl; modification with an abasic
pyrrolidine in a 3' overhang, e.g., at the 3' terminus; modification with naproxen, ibuprofen,
or other moieties which inhibit degradation at the 3' terminus. Preferred embodiments are
those in which one or more of these modifications are present on the sense but not the
antisense strand, or embodiments where the antisense strand has fewer of such modifications.

Modifications, e.g., those described herein, which affect targeting can be provided as 
asymmetrical modifications. Targeting modifications which can inhibit silencing, e.g., by 
inhibiting cleavage of a target, can be provided as asymmetrical modifications of the sense 
strand. A biodistribution altering moiety, e.g., cholesterol, can be provided in one or more, 
e.g., two, asymmetrical modifications of the sense strand. Targeting modifications which 
introduce moieties having a relatively large molecular weight, e.g., a molecular weight of
more than 400, 500, or 1000 daltons, or which introduce a charged moiety (e.g., having more 
than one positive charge or one negative charge) can be placed on the sense strand. 

Modifications, e.g., those described herein, which modulate, e.g., increase or 
decrease, the affinity of a strand for its complement or target, can be provided as
asymmetrical modifications. These include: 5 methyl U; 5 methyl C; pseudouridine; locked
nucleic acids; 2 thio U; and 2'-amino-A. In some embodiments one or more of these is
provided on the antisense strand.

iRNA agents have a defined structure, with a sense strand and an antisense strand, 
and in many cases short single strand overhangs, e.g., of 2 or 3 nucleotides are present at one 
or both 3' ends. Asymmetrical modification can be used to optimize the activity of such a 
structure, e.g., by being placed selectively within the iRNA. E.g., the end region of the iRNA 
agent defined by the 5' end of the sense strand and the 3' end of the antisense strand is
important for function. This region can include the terminal 2, 3, or 4 paired nucleotides and
any 3' overhang. In preferred embodiments asymmetrical modifications which result in one
or more of the following are used: modifications of the 5' end of the sense strand which
inhibit kinase activation of the sense strand, including, e.g., attachments of conjugates which
target the molecule or the use of modifications which protect against 5' exonucleolytic
degradation; or modifications of either strand, but preferably the sense strand, which enhance 
binding between the sense and antisense strand and thereby promote a "tight" structure at this 
end of the molecule. 

The end region of the iRNA agent defined by the 3' end of the sense strand and the
5' end of the antisense strand is also important for function. This region can include the
terminal 2, 3, or 4 paired nucleotides and any 3' overhang. Preferred embodiments include
asymmetrical modifications of either strand, but preferably the sense strand, which decrease
binding between the sense and antisense strand and thereby promote an "open" structure at
this end of the molecule. Such modifications include placing conjugates which target the
molecule or modifications which promote nuclease resistance on the sense strand in this
region. Modifications of the antisense strand which inhibit kinase activation are avoided in
preferred embodiments.
Exemplary modifications for asymmetrical placement in the sense strand include the
following:

(a) backbone modifications, e.g., modification of a backbone P, including
replacement of P with S, or P substituted with alkyl or allyl, e.g., Me, and dithioates (S-P=S);
these modifications can be used to promote nuclease resistance;

(b) 2'-O alkyl, e.g., 2'-OMe, 3'-O alkyl, e.g., 3'-OMe (at terminal and/or internal
positions); these modifications can be used to promote nuclease resistance or to enhance
binding of the sense to the antisense strand; the 3' modifications can be used at the 5' end of
the sense strand to avoid sense strand activation by RISC;

(c) 2'-5' linkages (with 2'-H, 2'-OH and 2'-OMe and with P=O or P=S); these
modifications can be used to promote nuclease resistance or to inhibit binding of the sense to
the antisense strand, or can be used at the 5' end of the sense strand to avoid sense strand
activation by RISC;

(d) L sugars (e.g., L ribose, L-arabinose with 2'-H, 2'-OH and 2'-OMe); these
modifications can be used to promote nuclease resistance or to inhibit binding of the sense to
the antisense strand, or can be used at the 5' end of the sense strand to avoid sense strand
activation by RISC;

(e) modified sugars (e.g., locked nucleic acids (LNA's), hexose nucleic acids
(HNA's) and cyclohexene nucleic acids (CeNA's)); these modifications can be used to
promote nuclease resistance or to inhibit binding of the sense to the antisense strand, or can
be used at the 5' end of the sense strand to avoid sense strand activation by RISC;

(f) nucleobase modifications (e.g., C-5 modified pyrimidines, N-2 modified purines,
N-7 modified purines, N-6 modified purines); these modifications can be used to promote
nuclease resistance or to enhance binding of the sense to the antisense strand;

(g) cationic groups and Zwitterionic groups (preferably at a terminus); these
modifications can be used to promote nuclease resistance;

(h) conjugate groups (preferably at terminal positions), e.g., naproxen, biotin,
cholesterol, ibuprofen, folic acid, peptides, and carbohydrates; these modifications can be
used to promote nuclease resistance or to target the molecule, or can be used at the 5' end of
the sense strand to avoid sense strand activation by RISC.
Exemplary modifications for asymmetrical placement in the antisense strand include
the following:

(a) backbone modifications, e.g., modification of a backbone P, including
replacement of P with S, or P substituted with alkyl or allyl, e.g., Me, and dithioates (S-P=S);

(b) 2'-O alkyl, e.g., 2'-OMe (at terminal positions);

(c) 2'-5' linkages (with 2'-H, 2'-OH and 2'-OMe), e.g., terminal at the 3' end, e.g.,
with P=O or P=S, preferably at the 3' end; these modifications are preferably excluded from
the 5' end region as they may interfere with RISC enzyme activity such as kinase activity;

(d) L sugars (e.g., L ribose, L-arabinose with 2'-H, 2'-OH and 2'-OMe), e.g., terminal
at the 3' end, e.g., with P=O or P=S, preferably at the 3' end; these modifications are
preferably excluded from the 5' end region as they may interfere with kinase activity;

(e) modified sugars (e.g., LNA's, HNA's and CeNA's); these modifications are
preferably excluded from the 5' end region as they may contribute to unwanted
enhancements of pairing between the sense and antisense strands (it is often preferred to have
a "loose" structure in the 5' region); additionally, they may interfere with kinase activity;

(f) nucleobase modifications (e.g., C-5 modified pyrimidines, N-2 modified purines,
N-7 modified purines, N-6 modified purines);

(g) cationic groups and Zwitterionic groups (preferably at a terminus);

(h) conjugate groups (preferably at terminal positions), e.g., naproxen, biotin, cholesterol,
ibuprofen, folic acid, peptides, and carbohydrates, but bulky groups or generally groups
which inhibit RISC activity are less preferred.

The 5'-OH of the antisense strand should be kept free to promote activity. In some
preferred embodiments modifications that promote nuclease resistance should be included at
the 3' end, particularly in the 3' overhang.

In another aspect, the invention features a method of optimizing, e.g., stabilizing, an
iRNA agent. The method includes selecting a sequence having activity, introducing one or
more asymmetric modifications into the sequence, wherein the introduction of the
asymmetric modification optimizes a property of the iRNA agent but does not result in a
decrease in activity.

The decrease in activity can be less than a preselected level of decrease. In
preferred embodiments decrease in activity means a decrease of less than 5, 10, 20, 40, or
50% activity, as compared with an otherwise similar iRNA lacking the introduced
modification. Activity can, e.g., be measured in vivo, or in vitro, with a result in either being
sufficient to demonstrate the required maintenance of activity.
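
The acceptance criterion described above can be sketched as a simple comparison, offered here only as an illustration. The sketch assumes activity is expressed as a single positive number on any consistent scale (e.g., percent target knockdown); the function name, parameter names, and the 20% default are hypothetical choices drawn from the listed thresholds.

```python
def retains_activity(modified_activity, unmodified_activity, max_decrease_percent=20.0):
    """Return True if the modified agent loses less than the preselected
    percentage of activity relative to an otherwise similar unmodified agent.

    Both activities must be on the same scale; the reference activity
    must be positive so that a percentage decrease is defined.
    """
    if unmodified_activity <= 0:
        raise ValueError("unmodified_activity must be positive")
    decrease = 100.0 * (unmodified_activity - modified_activity) / unmodified_activity
    return decrease < max_decrease_percent
```

A modification that increases activity yields a negative decrease and therefore passes at any threshold, which is consistent with the requirement that the modification not result in an unacceptable decrease.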

The optimized property can be any property described herein and in particular the
properties discussed in the section on asymmetrical modifications provided herein. The
modification can be any asymmetrical modification, e.g., an asymmetric modification
described in the section on asymmetrical modifications described herein. Particularly
preferred asymmetric modifications are 2'-O alkyl modifications, e.g., 2'-OMe
modifications, particularly in the sense sequence, and modifications of a backbone O,
particularly phosphorothioate modifications, in the antisense sequence.

In a preferred embodiment a sense sequence is selected and provided with an 
asymmetrical modification, while in other embodiments an antisense sequence is selected 
and provided with an asymmetrical modification. In some embodiments both sense and 
antisense sequences are selected and each provided with one or more asymmetrical
modifications.

Multiple asymmetric modifications can be introduced into either or both of the sense 
and antisense sequence. A sequence can have at least 2, 4, 6, 8, or more modifications and 
all or substantially all of the monomers of a sequence can be modified. 
Table 2. Some examples of Asymmetric Modification

This table shows examples having strand I with a selected modification and strand II
with a selected modification.

Strand I | Strand II
Nuclease Resistance (e.g., 2'-OMe) | Biodistribution (e.g., P=S)
Biodistribution conjugate (e.g., Lipophile) | Protein Binding Functionality (e.g., Naproxen)
Tissue Distribution Functionality (e.g., Carbohydrates) | Cell Targeting Functionality (e.g., Folate for cancer cells)
Tissue Distribution Functionality (e.g., Liver Cell Targeting Carbohydrates) | Fusogenic Functionality (e.g., Polyethylene imines)
Cancer Cell Targeting (e.g., RGD peptides and imines) | Fusogenic Functionality (e.g., peptides)
Nuclease Resistance (e.g., 2'-OMe) | Increase in binding Affinity (5-Me-C, 5-Me-U, 2-thio-U, 2-amino-A, G-clamp, LNA)
Tissue Distribution Functionality | RISC activity improving Functionality
Helical conformation changing Functionalities | Tissue Distribution Functionality (P=S; lipophile, carbohydrates)
Z-X-Y Architecture

In one aspect, the invention features an iRNA agent which can have a Z-X-Y
architecture or structure such as those described herein and those described in copending,
co-owned United States Provisional Application Serial No. 60/510,246 (Attorney Docket No.
14174-079P02), filed on October 9, 2003, which is hereby incorporated by reference, and in
copending, co-owned United States Provisional Application Serial No. 60/510,318 (Attorney
Docket No. 14174-079P03), filed on October 10, 2003, which is hereby incorporated by
reference.

In addition, the invention includes iRNA agents having a Z-X-Y structure and another 
element described herein. E.g., the invention includes an iRNA agent described herein, e.g., 
a palindromic iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent 
which targets a gene described herein, e.g., a gene active in the liver, an iRNA associated 
with an amphipathic delivery agent described herein, an iRNA associated with a drug
delivery module described herein, an iRNA agent administered as described herein, or an 
iRNA agent formulated as described herein, which also incorporates a Z-X-Y architecture. 

The invention provides an iRNA agent having a first segment, the Z region, a second 
segment, the X region, and optionally a third region, the Y region: 

Z-X-Y.



It may be desirable to modify subunits in one or both of Z and/or Y on one hand and X
on the other hand. In some cases they will have the same modification or the same class of
modification but it will more often be the case that the modifications made in Z and/or Y will
differ from those made in X.

The Z region typically includes a terminus of an iRNA agent. The length of the Z
region can vary, but will typically be from 2-14, more preferably 2-10, subunits in length. It
typically is single stranded, i.e., it will not base pair with bases of another strand, though it
may in some embodiments self associate, e.g., to form a loop structure. Such structures can
be formed by the end of a strand looping back and forming an intrastrand duplex. E.g., 2, 3,
4, 5 or more intra-strand base pairs can form, having a looped out or connecting region,
typically of 2 or more subunits which do not pair. This can occur at one or both ends of a
strand. A typical embodiment of a Z region is a single strand overhang, e.g., an overhang of
the length described elsewhere herein. The Z region can thus be or include a 3' or 5'
terminal single strand. It can be sense or antisense strand but if it is antisense it is preferred
that it is a 3' overhang. Typical inter-subunit bonds in the Z region include: P=O; P=S;
S-P=S; P-NR2; and P-BR2. Chiral P=X (where X is S, N, or B) inter-subunit bonds can also be
present. (These inter-subunit bonds are discussed in more detail elsewhere herein.) Other
preferred Z region subunit modifications (also discussed elsewhere herein) can include:
3'-OR, 3'SR, 2'-OMe, 3'-OMe, and 2'OH modifications and moieties; alpha configuration
bases; and 2' arabino modifications.

The X region will in most cases be duplexed, in the case of a single strand iRNA
agent, with a corresponding region of the single strand, or in the case of a double stranded
iRNA agent, with the corresponding region of the other strand. The length of the X region
can vary but will typically be between 10-45 and more preferably between 15 and 35
subunits. Particularly preferred region X's will include 17, 18, 19, 20, 21, 22, 23, 24, or 25
nucleotide pairs, though other suitable lengths are described elsewhere herein and can be
used. Typical X region subunits include 2'-OH subunits. In typical embodiments phosphate
inter-subunit bonds are preferred while phosphorothioate or non-phosphate bonds are absent.
Other modifications preferred in the X region include: modifications to improve binding,
e.g., nucleobase modifications; cationic nucleobase modifications; and C-5 modified
pyrimidines, e.g., allylamines. Some embodiments have 4 or more consecutive 2'OH
subunits. While the use of phosphorothioate linkages is sometimes non preferred, they can be
used if they connect less than 4 consecutive 2'OH subunits.

The Y region will generally conform to the parameters set out for the Z regions.
However, the Y and Z regions need not be the same: different types and numbers of
modifications can be present, and in fact, one will usually be a 3' overhang and one will
usually be a 5' overhang.
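
To make the Z-X-Y arrangement concrete, the following minimal sketch splits a strand, written as a string of one-letter base codes, into its Z, X, and Y segments given the lengths of the terminal regions. The function name and the convention that Z is taken from the start of the string and Y from its end are assumptions made for the example; the text does not fix which terminus is Z. Because the Y region is optional, its length defaults to zero.

```python
def split_zxy(strand, z_len, y_len=0):
    """Split a strand into (Z, X, Y) segments.

    Z is the first `z_len` subunits, Y is the last `y_len` subunits
    (optional, may be empty), and X is everything in between.
    """
    if z_len < 0 or y_len < 0 or z_len + y_len > len(strand):
        raise ValueError("segment lengths do not fit the strand")
    x_end = len(strand) - y_len
    return strand[:z_len], strand[z_len:x_end], strand[x_end:]
```

For instance, a 23-subunit strand with a 2-subunit Z region and a 2-subunit Y region leaves a 19-subunit X region, which falls within the particularly preferred X lengths given above.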

In a preferred embodiment the iRNA agent will have a Y and/or Z region each having
ribonucleosides in which the 2'-OH is substituted, e.g., with 2'-OMe or other alkyl; and an X
region that includes at least four consecutive ribonucleoside subunits in which the 2'-OH
remains unsubstituted.

The subunit linkages (the linkages between subunits) of an iRNA agent can be
modified, e.g., to promote resistance to degradation. Numerous examples of such
modifications are disclosed herein, one example of which is the phosphorothioate linkage.
These modifications can be provided between the subunits of any of the regions, Y, X, and Z.
However, it is preferred that their occurrence is minimized and in particular it is preferred that
consecutive modified linkages be avoided.

In a preferred embodiment the iRNA agent will have a Y and Z region each having 
ribonucleosides in which the 2'-OH is substituted, e.g., with 2'-OMe; and an X region that
includes at least four consecutive subunits, e.g., ribonucleoside subunits in which the 2'-OH
remains unsubstituted. 

As mentioned above, the subunit linkages of an iRNA agent can be modified, e.g., to 
promote resistance to degradation. These modifications can be provided between the 
subunits of any of the regions, Y, X, and Z. However, it is preferred that they are minimized 
and in particular it is preferred that consecutive modified linkages be avoided. 

Thus, in a preferred embodiment, not all of the subunit linkages of the iRNA agent
are modified and more preferably the maximum number of consecutive subunits linked by
other than a phosphodiester bond will be 2, 3, or 4. Particularly preferred iRNA agents will not
have four or more consecutive subunits, e.g., 2'-hydroxyl ribonucleoside subunits, in which
each subunit is joined by modified linkages, i.e., linkages that have been modified to
stabilize them from degradation as compared to the phosphodiester linkages that naturally
occur in RNA and DNA.
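
The preference for limiting runs of modified linkages can be illustrated with a short sketch that reports the longest stretch of subunits joined only by non-phosphodiester linkages. It assumes linkages are given in order as labels, with entry i describing the bond between subunit i and subunit i+1, and that "PO" denotes a natural phosphodiester bond; these labels and the function name are hypothetical. It also assumes one reading of the text, namely that k consecutive modified linkages join k+1 consecutive subunits.

```python
def max_consecutive_modified_subunits(linkages, natural="PO"):
    """Return the largest number of consecutive subunits joined solely by
    linkages other than the natural phosphodiester bond.

    linkages[i] labels the bond between subunit i and subunit i + 1.
    Returns 0 when no linkage is modified.
    """
    longest = run = 0
    for linkage in linkages:
        run = run + 1 if linkage != natural else 0   # extend or reset the current run
        longest = max(longest, run)
    # k consecutive modified linkages connect k + 1 subunits
    return longest + 1 if longest else 0
```

Under these assumptions, a strand meeting the stated preference would return a value of at most 2, 3, or 4, depending on the limit chosen.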

It is particularly preferred to minimize the occurrence of modified linkages in region X. Thus, in preferred
embodiments each of the nucleoside subunit linkages in X will be phosphodiester linkages, 
or if subunit linkages in region X are modified, such modifications will be minimized. E.g., 
although the Y and/or Z regions can include inter subunit linkages which have been 
stabilized against degradation, such modifications will be minimized in the X region, and in 
particular consecutive modifications will be minimized. Thus, in preferred embodiments the 
maximum number of consecutive subunits linked by other than a phospodiester bond will be 
2, 3, or 4. Particulary preferred X regions will not have four or more consecutive subunits, 
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e.g., 2'-hyciroxyl ribonucleoside subunits, in which each subunits is joined by modified 
linkages - i.e. linkages that have been modified to stabilize them from degradation as 
compared to the phosphodiester linkages that naturally occur in RNA and DNA. 
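
The limit on consecutive modified linkages described above can be checked mechanically. The following Python sketch is illustrative only and is not part of the disclosure: the encoding of a strand's linkages as a string ("o" for a phosphodiester linkage, "s" for any modified linkage such as a phosphorothioate) and both function names are assumptions made for the example. A run of k consecutive modified linkages is counted as joining k + 1 consecutive subunits.

```python
def max_modified_subunit_run(linkages: str, modified: str = "s") -> int:
    """Largest number of consecutive subunits joined only by modified linkages.

    `linkages` lists the intersubunit linkages 5' to 3'; the one-letter
    encoding is a hypothetical convention chosen for this sketch.
    """
    longest = current = 0
    for link in linkages:
        current = current + 1 if link == modified else 0
        longest = max(longest, current)
    # k consecutive modified linkages join k + 1 subunits; no run means 0.
    return longest + 1 if longest else 0


def avoids_long_modified_runs(linkages: str, limit: int = 4) -> bool:
    """True if fewer than `limit` consecutive subunits are joined by modified linkages."""
    return max_modified_subunit_run(linkages) < limit


if __name__ == "__main__":
    for strand in ("soosso", "osssoo", "oooooo"):
        print(strand, max_modified_subunit_run(strand), avoids_long_modified_runs(strand))
```

The default limit of 4 mirrors the statement that particularly preferred agents will not have four or more such consecutive subunits; a stricter limit can be passed for region X.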

In a preferred embodiment Y and/or Z will be free of phosphorothioate linkages,
though either or both may contain other modifications, e.g., other modifications of the 
subunit linkages. 

In a preferred embodiment region X, or in some cases, the entire iRNA agent, has no
more than 3 or no more than 4 subunits having identical 2' moieties.

In a preferred embodiment region X, or in some cases, the entire iRNA agent, has no 
more than 3 or no more than 4 subunits having identical subunit linkages. 

In a preferred embodiment one or more phosphorothioate linkages (or other
modifications of the subunit linkage) are present in Y and/or Z, but such modified linkages
do not connect two adjacent subunits, e.g., nucleosides, having a 2' modification, e.g., a
2'-O-alkyl moiety. E.g., any adjacent 2'-O-alkyl moieties in Y and/or Z are connected by a
linkage other than a phosphorothioate linkage.

In a preferred embodiment each of Y and/or Z independently has only one
phosphorothioate linkage between adjacent subunits, e.g., nucleosides, having a 2'
modification, e.g., 2'-O-alkyl nucleosides. If there is a second set of adjacent subunits, e.g.,
nucleosides, having a 2' modification, e.g., 2'-O-alkyl nucleosides, in Y and/or Z, that
second set is connected by a linkage other than a phosphorothioate linkage, e.g., a modified
linkage other than a phosphorothioate linkage.

In a preferred embodiment each of Y and/or Z independently has more than one
phosphorothioate linkage connecting adjacent pairs of subunits, e.g., nucleosides, having a 2'
modification, e.g., 2'-O-alkyl nucleosides, but at least one pair of adjacent subunits, e.g.,
nucleosides, having a 2' modification, e.g., 2'-O-alkyl nucleosides, is connected by a
linkage other than a phosphorothioate linkage, e.g., a modified linkage other than a
phosphorothioate linkage.

In a preferred embodiment one of the above recited limitations on adjacent subunits in
Y and/or Z is combined with a limitation on the subunits in X. E.g., one or more
phosphorothioate linkages (or other modifications of the subunit linkage) are present in Y
and/or Z, but such modified linkages do not connect two adjacent subunits, e.g., nucleosides,
having a 2' modification, e.g., a 2'-O-alkyl moiety. E.g., any adjacent 2'-O-alkyl moieties in
Y and/or Z are connected by a linkage other than a phosphorothioate linkage. In
addition, the X region has no more than 3 or no more than 4 identical subunits, e.g., subunits
having identical 2' moieties, or the X region has no more than 3 or no more than 4 subunits
having identical subunit linkages.

A Y and/or Z region can include at least one, and preferably 2, 3 or 4 of a
modification disclosed herein. Such modifications can be chosen, independently, from any
modification described herein, e.g., from nuclease resistant subunits, subunits with modified
bases, subunits with modified intersubunit linkages, subunits with modified sugars, and
subunits linked to another moiety, e.g., a targeting moiety. In a preferred embodiment more
than 1 of such subunits can be present but in some embodiments it is preferred that no more
than 1, 2, 3, or 4 of such modifications occur, or occur consecutively. In a preferred
embodiment the frequency of the modification will differ between Y and/or Z and X, e.g., the
modification will be present in one of Y and/or Z or X and absent in the other.

An X region can include at least one, and preferably 2, 3 or 4 of a modification
disclosed herein. Such modifications can be chosen, independently, from any modification
described herein, e.g., from nuclease resistant subunits, subunits with modified bases, subunits
with modified intersubunit linkages, subunits with modified sugars, and subunits linked to
another moiety, e.g., a targeting moiety. In a preferred embodiment more than 1 of such
subunits can be present but in some embodiments it is preferred that no more than 1, 2, 3, or 4
of such modifications occur, or occur consecutively.

An RRMS (described elsewhere herein) can be introduced at one or more points in one
or both strands of a double-stranded iRNA agent. An RRMS can be placed in a Y and/or Z
region, at or near (within 1, 2, or 3 positions of) the 3' or 5' end of the sense strand or at or
near (within 2 or 3 positions of) the 3' end of the antisense strand. In some embodiments it is
preferred to not have an RRMS at or near (within 1, 2, or 3 positions of) the 5' end of the
antisense strand. An RRMS can be positioned in the X region, and will preferably be
positioned in the sense strand or in an area of the antisense strand not critical for antisense
binding to the target.

Differential Modification of Terminal Duplex Stability
In one aspect, the invention features an iRNA agent which can have differential
modification of terminal duplex stability (DMTDS).

In addition, the invention includes iRNA agents having DMTDS and another element
described herein. E.g., the invention includes an iRNA agent described herein, e.g., a
palindromic iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent
which targets a gene described herein, e.g., a gene active in the liver, an iRNA agent having
an architecture or structure described herein, an iRNA associated with an amphipathic
delivery agent described herein, an iRNA associated with a drug delivery module described
herein, an iRNA agent administered as described herein, or an iRNA agent formulated as
described herein, which also incorporates DMTDS.

iRNA agents can be optimized by increasing the propensity of the duplex to
disassociate or melt (decreasing the free energy of duplex association), in the region of the 5'
end of the antisense strand duplex. This can be accomplished, e.g., by the inclusion of
subunits which increase the propensity of the duplex to disassociate or melt in the region of
the 5' end of the antisense strand. It can also be accomplished by the attachment of a ligand
that increases the propensity of the duplex to disassociate or melt in the region of the 5' end.
While not wishing to be bound by theory, the effect may be due to promoting the effect of an
enzyme such as helicase, for example, promoting the effect of the enzyme in the proximity of
the 5' end of the antisense strand.

The inventors have also discovered that iRNA agents can be optimized by decreasing
the propensity of the duplex to disassociate or melt (increasing the free energy of duplex
association), in the region of the 3' end of the antisense strand duplex. This can be
accomplished, e.g., by the inclusion of subunits which decrease the propensity of the duplex
to disassociate or melt in the region of the 3' end of the antisense strand. It can also be
accomplished by the attachment of a ligand that decreases the propensity of the duplex to
disassociate or melt in the region of the 5' end.

Modifications which increase the tendency of the 5' end of the duplex to dissociate
can be used alone or in combination with other modifications described herein, e.g., with
modifications which decrease the tendency of the 3' end of the duplex to dissociate.
Likewise, modifications which decrease the tendency of the 3' end of the duplex to dissociate
can be used alone or in combination with other modifications described herein, e.g., with
modifications which increase the tendency of the 5' end of the duplex to dissociate.

Decreasing the stability of the AS 5' end of the duplex

Subunit pairs can be ranked on the basis of their propensity to promote dissociation or
melting (e.g., on the free energy of association or dissociation of a particular pairing; the
simplest approach is to examine the pairs on an individual pair basis, though next neighbor or
similar analysis can also be used). In terms of promoting dissociation:

A:U is preferred over G:C;

G:U is preferred over G:C;

I:C is preferred over G:C (I=inosine);

mismatches, e.g., non-canonical or other than canonical pairings (as described
elsewhere herein) are preferred over canonical (A:T, A:U, G:C) pairings;

pairings which include a universal base are preferred over canonical pairings.
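
As a rough illustration of the ranking above, the following Python sketch labels a single subunit pair as dissociation-promoting or not. It is a simplification under stated assumptions, not a method taken from the disclosure: bases are one-letter codes, "I" stands for inosine, "X" is a hypothetical symbol for a universal base, pairs are treated as unordered, and no free-energy or next-neighbor calculation is attempted.

```python
# Canonical pairings as recited in the text (A:T, A:U, G:C), stored unordered.
CANONICAL = {frozenset("AT"), frozenset("AU"), frozenset("GC")}
# Pairings recited as preferred over G:C for promoting dissociation.
PREFERRED_OVER_GC = {frozenset("AU"), frozenset("GU"), frozenset("IC")}
UNIVERSAL_BASE = "X"  # hypothetical one-letter code for a universal base


def promotes_dissociation(sense_base: str, antisense_base: str) -> bool:
    """Return True if the pair falls in one of the dissociation-promoting classes."""
    pair = frozenset((sense_base, antisense_base))
    if UNIVERSAL_BASE in pair:
        return True  # pairing which includes a universal base
    if pair in PREFERRED_OVER_GC:
        return True  # A:U, G:U or I:C
    return pair not in CANONICAL  # any other non-canonical pairing is a mismatch


if __name__ == "__main__":
    for s, a in [("A", "U"), ("G", "U"), ("I", "C"), ("G", "C"), ("A", "A"), ("X", "G")]:
        print(f"{s}:{a}", promotes_dissociation(s, a))
```

Because A:U is itself canonical, the sketch checks the recited group before the canonical set; only G:C and A:T are reported as not promoting dissociation.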



A typical ds iRNA agent can be diagrammed as follows: 

S     5' R1 N1 N2 N3 N4 N5 [N] N-5 N-4 N-3 N-2 N-1 R2 3'
AS    3' R3 N1 N2 N3 N4 N5 [N] N-5 N-4 N-3 N-2 N-1 R4 5'

S:AS     P1 P2 P3 P4 P5 [N] P-5 P-4 P-3 P-2 P-1 5'



S indicates the sense strand; AS indicates antisense strand; R1 indicates an optional
(and nonpreferred) 5' sense strand overhang; R2 indicates an optional (though preferred) 3'
sense overhang; R3 indicates an optional (though preferred) 3' antisense overhang; R4
indicates an optional (and nonpreferred) 5' antisense overhang; N indicates subunits; [N]
indicates that additional subunit pairs may be present; and Px indicates a pairing of sense Nx
and antisense Nx. Overhangs are not shown in the P diagram. In some embodiments a 3' AS
overhang corresponds to region Z, the duplex region corresponds to region X, and the 3' S
strand overhang corresponds to region Y, as described elsewhere herein. (The diagram is not
meant to imply maximum or minimum lengths, on which guidance is provided elsewhere
herein.)

It is preferred that pairings which decrease the propensity to form a duplex are used at
1 or more of the positions in the duplex at the 5' end of the AS strand. The terminal pair (the
most 5' pair in terms of the AS strand) is designated as P-1, and the subsequent pairing
positions (going in the 3' direction in terms of the AS strand) in the duplex are designated
P-2, P-3, P-4, P-5, and so on. The preferred region in which to modify to modulate duplex
formation is at P-5 through P-1, more preferably P-4 through P-1, more preferably P-3 through
P-1. Modification at P-1 is particularly preferred, alone or with modification(s) at other
position(s), e.g., any of the positions just identified. It is preferred that at least 1, and more
preferably 2, 3, 4, or 5 of the pairs of one of the recited regions be chosen independently
from the group of:

A:U 

G:U

I:C 

mismatched pairs, e.g., non-canonical or other than canonical pairings or pairings 
which include a universal base. 

In preferred embodiments the change in subunit needed to achieve a pairing which
promotes dissociation will be made in the sense strand, though in some embodiments the
change will be made in the antisense strand.

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are pairs
which promote dissociation.

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are A:U.

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are G:U.

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are I:C.

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are
mismatched pairs, e.g., non-canonical or other than canonical pairings.

In a preferred embodiment at least 2, or 3, of the pairs in P-1 through P-4 are
pairings which include a universal base.

Increasing the stability of the AS 3' end of the duplex

Subunit pairs can be ranked on the basis of their propensity to promote stability and
inhibit dissociation or melting (e.g., on the free energy of association or dissociation of a
particular pairing; the simplest approach is to examine the pairs on an individual pair basis,
though next neighbor or similar analysis can also be used). In terms of promoting duplex
stability:

G:C is preferred over A:U

Watson-Crick matches (A:T, A:U, G:C) are preferred over non-canonical or other
than canonical pairings

analogs that increase stability are preferred over Watson-Crick matches (A:T, A:U,
G:C)

2-amino-A:U is preferred over A:U

2-thio U or 5 Me-thio-U:A are preferred over U:A

G-clamp (an analog of C having 4 hydrogen bonds):G is preferred over C:G

guanidinium-G-clamp:G is preferred over C:G

pseudouridine:A is preferred over U:A

sugar modifications, e.g., 2' modifications, e.g., 2'F, ENA, or LNA, which enhance
binding are preferred over non-modified moieties and can be present on one or both strands
to enhance stability of the duplex.

It is preferred that pairings which increase the propensity
to form a duplex are used at 1 or more of the positions in the duplex at the 3' end of the AS
strand. The terminal pair (the most 3' pair in terms of the AS strand) is designated as P1, and
the subsequent pairing positions (going in the 5' direction in terms of the AS strand) in the
duplex are designated P2, P3, P4, P5, and so on. The preferred region in which to modify to
modulate duplex formation is at P5 through P1, more preferably P4 through P1, more
preferably P3 through P1. Modification at P1 is particularly preferred, alone or with
modification(s) at other position(s), e.g., any of the positions just identified. It is preferred that
at least 1, and more preferably 2, 3, 4, or 5 of the pairs of the recited regions be chosen
independently from the group of:

G:C

a pair having an analog that increases stability over Watson-Crick matches (A:T,
A:U, G:C)

2-amino-A:U

2-thio U or 5 Me-thio-U:A

G-clamp (an analog of C having 4 hydrogen bonds):G

guanidinium-G-clamp:G

pseudouridine:A

a pair in which one or both subunits has a sugar modification, e.g., a 2'
modification, e.g., 2'F, ENA, or LNA, which enhances binding.



In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are pairs
which promote duplex stability.

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are G:C.

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are a pair
having an analog that increases stability over Watson-Crick matches.

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are
2-amino-A:U.

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are 2-thio
U or 5 Me-thio-U:A.

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are
G-clamp:G.

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are
guanidinium-G-clamp:G.

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are
pseudouridine:A.

In a preferred embodiment at least 2, or 3, of the pairs in P1 through P4 are a pair
in which one or both subunits has a sugar modification, e.g., a 2' modification, e.g., 2'F,
ENA, or LNA, which enhances binding.

G-clamps and guanidinium G-clamps are discussed in the following references:
Holmes and Gait, "The Synthesis of 2'-O-Methyl G-Clamp Containing Oligonucleotides and
Their Inhibition of the HIV-1 Tat-TAR Interaction," Nucleosides, Nucleotides & Nucleic
Acids, 22:1259-1262, 2003; Holmes et al., "Steric inhibition of human immunodeficiency
virus type-1 Tat-dependent trans-activation in vitro and in cells by oligonucleotides
containing 2'-O-methyl G-clamp ribonucleoside analogues," Nucleic Acids Research,
31:2759-2768, 2003; Wilds, et al., "Structural basis for recognition of guanosine by a
synthetic tricyclic cytosine analogue: Guanidinium G-clamp," Helvetica Chimica Acta,
86:966-978, 2003; Rajeev, et al., "High-Affinity Peptide Nucleic Acid Oligomers
Containing Tricyclic Cytosine Analogues," Organic Letters, 4:4395-4398, 2002; Ausin, et
al., "Synthesis of Amino- and Guanidino-G-Clamp PNA Monomers," Organic Letters,
4:4073-4075, 2002; Maier et al., "Nuclease resistance of oligonucleotides containing the
tricyclic cytosine analogues phenoxazine and 9-(2-aminoethoxy)-phenoxazine ("G-clamp")
and origins of their nuclease resistance properties," Biochemistry, 41:1323-7, 2002;
Flanagan, et al., "A cytosine analog that confers enhanced potency to antisense
oligonucleotides," Proceedings of the National Academy of Sciences of the United States
of America, 96:3513-8, 1999.

Simultaneously decreasing the stability of the AS 5' end of the duplex and increasing
the stability of the AS 3' end of the duplex

As is discussed above, an iRNA agent can be modified to both decrease the stability
of the AS 5' end of the duplex and increase the stability of the AS 3' end of the duplex. This
can be effected by combining one or more of the stability decreasing modifications in the AS
5' end of the duplex with one or more of the stability increasing modifications in the AS 3'
end of the duplex. Accordingly a preferred embodiment includes modification in P-5 through
P-1, more preferably P-4 through P-1 and more preferably P-3 through P-1. Modification at P-1
is particularly preferred, alone or with modification at other position(s), e.g., the positions
just identified. It is preferred that at least 1, and more preferably 2, 3, 4, or 5 of the pairs of
one of the recited regions of the AS 5' end of the duplex region be chosen independently from
the group of:

A:U
G:U
I:C
mismatched pairs, e.g., non-canonical or other than canonical pairings or pairings which
include a universal base; and

a modification in P5 through P1, more preferably P4 through P1 and more preferably
P3 through P1. Modification at P1 is particularly preferred, alone or with modification at
other position(s), e.g., the positions just identified. It is preferred that at least 1, and more
preferably 2, 3, 4, or 5 of the pairs of one of the recited regions of the AS 3' end of the duplex
region be chosen independently from the group of:

G:C

a pair having an analog that increases stability over Watson-Crick matches (A:T,
A:U, G:C)

2-amino-A:U

2-thio U or 5 Me-thio-U:A

G-clamp (an analog of C having 4 hydrogen bonds):G

guanidinium-G-clamp:G

pseudouridine:A

a pair in which one or both subunits has a sugar modification, e.g., a 2'
modification, e.g., 2'F, ENA, or LNA, which enhances binding.

The invention also includes methods of selecting and making iRNA agents having
DMTDS. E.g., when screening a target sequence for candidate sequences for use as iRNA
agents one can select sequences having a DMTDS property described herein or one which
can be modified, preferably with as few changes as possible, especially to the
AS strand, to provide a desired level of DMTDS.

The invention also includes providing a candidate iRNA agent sequence, and
modifying at least one P in P-5 through P-1 and/or at least one P in P5 through P1 to provide a
DMTDS iRNA agent.
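
A candidate duplex can be screened for a DMTDS-like profile along the lines described above. The Python sketch below is a minimal illustration under assumptions that are not part of the disclosure: the duplex is given as a list of (sense, antisense) base pairs ordered from P1 (AS 3' end) to P-1 (AS 5' end), only G:C is counted as stabilizing, only A:U, G:U and I:C are counted as destabilizing (mismatches, universal bases and stabilizing analogs are ignored for brevity), and the function names, window and threshold are hypothetical.

```python
STABILIZING = {frozenset("GC")}
DESTABILIZING = {frozenset("AU"), frozenset("GU"), frozenset("IC")}


def dmtds_profile(pairs, window=4):
    """Count stabilizing pairs in P1..Pwindow and destabilizing pairs in P-window..P-1.

    `pairs` is ordered from the AS 3' end (P1) to the AS 5' end (P-1).
    """
    as_3_prime_end = pairs[:window]   # P1 through Pwindow
    as_5_prime_end = pairs[-window:]  # P-window through P-1
    stabilizing = sum(frozenset(p) in STABILIZING for p in as_3_prime_end)
    destabilizing = sum(frozenset(p) in DESTABILIZING for p in as_5_prime_end)
    return stabilizing, destabilizing


def has_dmtds(pairs, window=4, minimum=2):
    """True if both terminal windows contain at least `minimum` qualifying pairs."""
    stabilizing, destabilizing = dmtds_profile(pairs, window)
    return stabilizing >= minimum and destabilizing >= minimum


if __name__ == "__main__":
    candidate = [("G", "C"), ("C", "G"), ("A", "U"), ("G", "C"), ("C", "G"),
                 ("U", "A"), ("A", "U"), ("G", "U"), ("U", "A")]
    print(dmtds_profile(candidate), has_dmtds(candidate))
```

The default window of 4 and minimum of 2 follow the embodiments reciting at least 2, or 3, of the pairs in P-1 through P-4 and in P1 through P4; both can be changed to reflect the other recited ranges.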

DMTDS iRNA agents can be used in any method described herein, e.g., to silence
any gene disclosed herein, to treat any disorder described herein, in any formulation
described herein, and generally in and/or with the methods and compositions described
elsewhere herein. DMTDS iRNA agents can incorporate other modifications described
herein, e.g., the attachment of targeting agents or the inclusion of modifications which
enhance stability, e.g., the inclusion of nuclease resistant monomers or the inclusion of single
strand overhangs (e.g., 3' AS overhangs and/or 3' S strand overhangs) which self associate to
form intrastrand duplex structure.

Preferably these iRNA agents will have an architecture described herein.

Other Embodiments 

In vivo Delivery 

An iRNA agent can be linked, e.g., noncovalently linked to a polymer for the efficient
delivery of the iRNA agent to a subject, e.g., a mammal, such as a human. The iRNA agent
can, for example, be complexed with cyclodextrin. Cyclodextrins have been used as delivery
vehicles of therapeutic compounds. Cyclodextrins can form inclusion complexes with drugs
that are able to fit into the hydrophobic cavity of the cyclodextrin. In other examples,
cyclodextrins form non-covalent associations with other biologically active molecules such
as oligonucleotides and derivatives thereof. The use of cyclodextrins creates a water-soluble
drug delivery complex that can be modified with targeting or other functional groups.
Cyclodextrin cellular delivery systems for oligonucleotides described in U.S. Pat. No.
5,691,316, which is hereby incorporated by reference, are suitable for use in methods of the
invention. In this system, an oligonucleotide is noncovalently complexed with a
cyclodextrin, or the oligonucleotide is covalently bound to adamantane which in turn is
non-covalently associated with a cyclodextrin.

The delivery molecule can include a linear cyclodextrin copolymer or a linear
oxidized cyclodextrin copolymer having at least one ligand bound to the cyclodextrin
copolymer. Delivery systems, as described in U.S. Patent No. 6,509,323, herein
incorporated by reference, are suitable for use in methods of the invention. An iRNA agent
can be bound to the linear cyclodextrin copolymer and/or a linear oxidized cyclodextrin
copolymer. Either or both of the cyclodextrin or oxidized cyclodextrin copolymers can be
crosslinked to another polymer and/or bound to a ligand.

A composition for iRNA delivery can employ an "inclusion complex," a molecular
compound having the characteristic structure of an adduct. In this structure, the "host
molecule" spatially encloses at least part of another compound in the delivery vehicle. The
enclosed compound (the "guest molecule") is situated in the cavity of the host molecule
without affecting the framework structure of the host. A "host" is preferably cyclodextrin,
but can be any of the molecules suggested in U.S. Patent Publ. 2003/0008818, herein
incorporated by reference.

Cyclodextrins can interact with a variety of ionic and molecular species, and the
resulting inclusion compounds belong to the class of "host-guest" complexes. Within the
host-guest relationship, the binding sites of the host and guest molecules should be
complementary in the stereoelectronic sense. A composition of the invention can contain at
least one polymer and at least one therapeutic agent, generally in the form of a particulate
composite of the polymer and therapeutic agent, e.g., the iRNA agent. The iRNA agent can
contain one or more complexing agents. At least one polymer of the particulate composite
can interact with the complexing agent in a host-guest or a guest-host interaction to form an
inclusion complex between the polymer and the complexing agent. The polymer and, more
particularly, the complexing agent can be used to introduce functionality into the
composition. For example, at least one polymer of the particulate composite has host
functionality and forms an inclusion complex with a complexing agent having guest
functionality. Alternatively, at least one polymer of the particulate composite has guest
functionality and forms an inclusion complex with a complexing agent having host
functionality. A polymer of the particulate composite can also contain both host and guest
functionalities and form inclusion complexes with guest complexing agents and host
complexing agents. A polymer with functionality can, for example, facilitate cell targeting
and/or cell contact (e.g., targeting or contact to a liver cell), intercellular trafficking, and/or
cell entry and release.

Upon forming the particulate composite, the iRNA agent may or may not retain its
biological or therapeutic activity. Upon release from the therapeutic composition,
specifically, from the polymer of the particulate composite, the activity of the iRNA agent is
restored. Accordingly, the particulate composite advantageously affords the iRNA agent
protection against loss of activity due to, for example, degradation and offers enhanced
bioavailability. Thus, a composition may be used to provide stability, particularly storage or
solution stability, to an iRNA agent or any active chemical compound. The iRNA agent may
be further modified with a ligand prior to or after particulate composite or therapeutic
composition formation. The ligand can provide further functionality. For example, the
ligand can be a targeting moiety.

Physiological Effects 

The iRNA agents described herein can be designed such that determining therapeutic
toxicity is made easier by the complementarity of the iRNA agent with both a human and a
non-human animal sequence. By these methods, an iRNA agent can consist of a sequence
that is fully complementary to a nucleic acid sequence from a human and a nucleic acid
sequence from at least one non-human animal, e.g., a non-human mammal, such as a rodent,
ruminant or primate. For example, the non-human mammal can be a mouse, rat, dog, pig,
goat, sheep, cow, monkey, Pan paniscus, Pan troglodytes, Macaca mulatta, or Cynomolgus
monkey. The sequence of the iRNA agent could be complementary to sequences within
homologous genes, e.g., oncogenes or tumor suppressor genes, of the non-human mammal
and the human. By determining the toxicity of the iRNA agent in the non-human mammal,
one can extrapolate the toxicity of the iRNA agent in a human. For a more strenuous toxicity
test, the iRNA agent can be complementary to a human and more than one, e.g., two or three
or more, non-human animals.
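
The cross-species design criterion above amounts to checking that the site targeted by the antisense strand occurs in the transcript of each species of interest. The Python sketch below illustrates this with a plain substring search; the sequences are invented toy strings, the function names are hypothetical, and a practical screen would use real transcript data and an alignment tool rather than exact matching on short strings.

```python
# Complement table for RNA written 5' to 3' using A, C, G, U only.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(_COMPLEMENT)[::-1]


def complementary_species(antisense: str, transcripts: dict) -> dict:
    """Map each species to True if its transcript contains a site that is
    fully complementary to the antisense strand."""
    target_site = reverse_complement(antisense)
    return {species: target_site in seq for species, seq in transcripts.items()}


if __name__ == "__main__":
    toy_transcripts = {  # invented sequences, for illustration only
        "human": "GGUCGAACC",
        "mouse": "AAUCGAAUU",
        "rat": "GGUCGUACC",
    }
    print(complementary_species("UUCGA", toy_transcripts))
```

An agent that returns True for the human transcript and for one or more non-human transcripts is a candidate for the extrapolation described above.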

The methods described herein can be used to correlate any physiological effect of an iRNA
agent on a human, e.g., any unwanted effect, such as a toxic effect, or any positive, or desired
effect.

Delivery Module 

In one aspect, the invention features a drug delivery conjugate or module, such as 
those described herein and those described in copending, co-owned United States Provisional 
Application Serial No. 60/454,265, filed on March 12, 2003, which is hereby incorporated by 
reference. 

In addition, the invention includes iRNA agents described herein, e.g., a palindromic
iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent which targets a
gene described herein, e.g., a gene active in the liver, an iRNA agent having a chemical
modification described herein, e.g., a modification which enhances resistance to degradation,
an iRNA agent having an architecture or structure described herein, an iRNA agent
administered as described herein, or an iRNA agent formulated as described herein,
combined with, associated with, and delivered by such a drug delivery conjugate or module.

The iRNA agents can be complexed to a delivery agent that features a modular
complex. The complex can include a carrier agent linked to one or more of (preferably two
or more, more preferably all three of): (a) a condensing agent (e.g., an agent capable of
attracting, e.g., binding, a nucleic acid, e.g., through ionic or electrostatic interactions); (b) a
fusogenic agent (e.g., an agent capable of fusing and/or being transported through a cell
membrane, e.g., an endosome membrane); and (c) a targeting group, e.g., a cell or tissue
targeting agent, e.g., a lectin, glycoprotein, lipid or protein, e.g., an antibody, that binds to a
specified cell type such as a cancer cell, endothelial cell or bone cell.

An iRNA agent, e.g., iRNA agent or sRNA agent described herein, can be linked,
e.g., coupled or bound, to the modular complex. The iRNA agent can interact with the
condensing agent of the complex, and the complex can be used to deliver an iRNA agent to a
cell, e.g., in vitro or in vivo. For example, the complex can be used to deliver an iRNA agent
to a subject in need thereof, e.g., to deliver an iRNA agent to a subject having a disorder, e.g.,
a disorder described herein, such as a disease or disorder of the liver.

The fusogenic agent and the condensing agent can be different agents or one and
the same agent. For example, a polyamino chain, e.g., polyethyleneimine (PEI), can be the
fusogenic and/or the condensing agent.

The delivery agent can be a modular complex. For example, the complex can include 
a carrier agent linked to one or more of (preferably two or more, more preferably all three 
of): 

(a) a condensing agent (e.g., an agent capable of attracting, e.g., binding, a nucleic
acid, e.g., through ionic interaction),

(b) a fusogenic agent (e.g., an agent capable of fusing and/or being transported
through a cell membrane, e.g., an endosome membrane), and

(c) a targeting group, e.g., a cell or tissue targeting agent, e.g., a lectin, glycoprotein,
lipid or protein, e.g., an antibody, that binds to a specified cell type such as a cancer cell,
endothelial cell, or bone cell. A targeting group can be a thyrotropin, melanotropin, lectin,
glycoprotein, surfactant protein A, mucin carbohydrate, multivalent lactose, multivalent
galactose, N-acetyl-galactosamine, N-acetyl-glucosamine, multivalent mannose, multivalent
fucose, glycosylated polyaminoacids, multivalent galactose, transferrin, bisphosphonate,
polyglutamate, polyaspartate, a lipid, cholesterol, a steroid, bile acid, folate, vitamin B12,
biotin, Neproxin, or an RGD peptide or RGD peptide mimetic.


Carrier agents 

The carrier agent of a modular complex described herein can be a substrate for
attachment of one or more of: a condensing agent, a fusogenic agent, and a targeting group.
The carrier agent would preferably lack an endogenous enzymatic activity. The agent would
preferably be a biological molecule, preferably a macromolecule. Polymeric biological
carriers are preferred. It would also be preferred that the carrier molecule be biodegradable.

The carrier agent can be a naturally occurring substance, such as a protein (e.g.,
human serum albumin (HSA), low-density lipoprotein (LDL), or globulin); carbohydrate
(e.g., a dextran, pullulan, chitin, chitosan, inulin, cyclodextrin or hyaluronic acid); or lipid.
The carrier molecule can also be a recombinant or synthetic molecule, such as a synthetic
polymer, e.g., a synthetic polyamino acid. Examples of polyamino acids include polylysine
(PLL), poly L-aspartic acid, poly L-glutamic acid, styrene-maleic acid anhydride copolymer,
poly(L-lactide-co-glycolide) copolymer, divinyl ether-maleic anhydride copolymer,
N-(2-hydroxypropyl)methacrylamide copolymer (HMPA), polyethylene glycol (PEG), polyvinyl
alcohol (PVA), polyurethane, poly(2-ethylacrylic acid), N-isopropylacrylamide polymers, or
polyphosphazine. Other useful carrier molecules can be identified by routine methods.

A carrier agent can be characterized by one or more of: (a) is at least 1 Da in size; (b) 
has at least 5 charged groups, preferably between 5 and 5000 charged groups; (c) is present 
in the complex at a ratio of at least 1:1 carrier agent to fusogenic agent; (d) is present in the
complex at a ratio of at least 1:1 carrier agent to condensing agent; (e) is present in the
complex at a ratio of at least 1:1 carrier agent to targeting agent.



Fusogenic agents

A fusogenic agent of a modular complex described herein can be an agent that is 
responsive to, e.g., changes charge depending on, the pH environment. Upon encountering
the pH of an endosome, it can cause a physical change, e.g., a change in osmotic properties
which disrupts or increases the permeability of the endosome membrane. Preferably, the
fusogenic agent changes charge, e.g., becomes protonated, at pH lower than physiological
range. For example, the fusogenic agent can become protonated at pH 4.5-6.5. The
fusogenic agent can serve to release the iRNA agent into the cytoplasm of a cell after the
complex is taken up, e.g., via endocytosis, by the cell, thereby increasing the cellular
concentration of the iRNA agent in the cell.

In one embodiment, the fusogenic agent can have a moiety, e.g., an amino group,
which, when exposed to a specified pH range, will undergo a change, e.g., in charge, e.g., 
protonation. The change in charge of the fusogenic agent can trigger a change, e.g., an 
osmotic change, in a vesicle, e.g., an endocytic vesicle, e.g., an endosome. For example, the 
fusogenic agent, upon being exposed to the pH environment of an endosome, will cause a 
solubility or osmotic change substantial enough to increase the porosity of (preferably, to
rupture) the endosomal membrane. 
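
The pH-dependent protonation of an ionizable moiety such as an amino group can be
illustrated with the Henderson-Hasselbalch relationship. The following is a minimal sketch
only: the pKa of 6.0 is an assumed, hypothetical value chosen for illustration and is not a
value given in this disclosure, and the function name is likewise introduced here.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of a basic group (e.g., an amine) present in protonated form.

    Henderson-Hasselbalch for a base B: [BH+]/[B] = 10 ** (pKa - pH).
    """
    ratio = 10.0 ** (pka - ph)
    return ratio / (1.0 + ratio)


# Hypothetical pKa for an ionizable amino group (assumption, not from the text).
ASSUMED_PKA = 6.0

# Compare physiological pH with the endosomal range mentioned above (4.5-6.5).
for ph in (7.4, 6.5, 5.5, 4.5):
    fraction = protonated_fraction(ph, ASSUMED_PKA)
    print(f"pH {ph:.1f}: {fraction:.1%} protonated")
```

Under this assumption the group is mostly unprotonated at physiological pH and mostly
protonated toward the lower end of the 4.5-6.5 range, which is the qualitative behavior
described for a fusogenic agent.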

The fusogenic agent can be a polymer, preferably a polyamino chain, e.g., 
polyethyleneimine (PEI). The PEI can be linear, branched, synthetic or natural. The PEI can
be, e.g., alkyl substituted PEI, or lipid substituted PEI.

In other embodiments, the fusogenic agent can be polyhistidine, polyimidazole, 
polypyridine, polypropyleneimine, melittin, or a polyacetal substance, e.g., a cationic
polyacetal. In some embodiments, the fusogenic agent can have an alpha helical structure.
The fusogenic agent can be a membrane disruptive agent, e.g., melittin.

A fusogenic agent can have one or more of the following characteristics: (a) is at least 
1 Da in size; (b) has at least 10 charged groups, preferably between 10 and 5000 charged
groups, more preferably between 50 and 1000 charged groups; (c) is present in the complex
at a ratio of at least 1:1 fusogenic agent to carrier agent; (d) is present in the complex at a
ratio of at least 1:1 fusogenic agent to condensing agent; (e) is present in the complex at a
ratio of at least 1:1 fusogenic agent to targeting agent.

Other suitable fusogenic agents can be tested and identified by a skilled artisan. The 
ability of a compound to respond to, e.g., change charge depending on, the pH environment 
can be tested by routine methods, e.g., in a cellular assay. For example, a test compound is 
combined or contacted with a cell, and the cell is allowed to take up the test compound, e.g.,
by endocytosis. An endosome preparation can then be made from the contacted cells and the
endosome preparation compared to an endosome preparation from control cells. A change,
e.g., a decrease, in the endosome fraction from the contacted cell vs. the control cell indicates
that the test compound can function as a fusogenic agent. Alternatively, the contacted cell
and control cell can be evaluated, e.g., by microscopy, e.g., by light or electron microscopy,
to determine a difference in endosome population in the cells. The test compound can be
labeled. In another type of assay, a modular complex described herein is constructed using
one or more test or putative fusogenic agents. The modular complex can be constructed
using a labeled nucleic acid instead of the iRNA. The ability of the fusogenic agent to
respond to, e.g., change charge depending on, the pH environment, once the modular
complex is taken up by the cell, can be evaluated, e.g., by preparation of an endosome
preparation, or by microscopy techniques, as described above. A two-step assay can also be
performed, wherein a first assay evaluates the ability of a test compound alone to respond to,
e.g., change charge depending on, the pH environment; and a second assay evaluates the
ability of a modular complex that includes the test compound to respond to, e.g., change
charge depending on, the pH environment.

Condensing agent 

The condensing agent of a modular complex described herein can interact with (e.g., 
attracts, holds, or binds to) an iRNA agent and act to (a) condense, e.g., reduce the size or
charge of the iRNA agent and/or (b) protect the iRNA agent, e.g., protect the iRNA agent
against degradation. The condensing agent can include a moiety, e.g., a charged moiety, that
can interact with a nucleic acid, e.g., an iRNA agent, e.g., by ionic interactions. The
condensing agent would preferably be a charged polymer, e.g., a polycationic chain. The
condensing agent can be a polylysine (PLL), spermine, spermidine, polyamine,
pseudopeptide-polyamine, peptidomimetic polyamine, dendrimer polyamine, arginine,
amidine, protamine, cationic lipid, cationic porphyrin, quaternary salt of a polyamine, or an
alpha helical peptide.

A condensing agent can have the following characteristics: (a) is at least 1 Da in size; (b)
has at least 2 charged groups, preferably between 2 and 100 charged groups; (c) is present in
the complex at a ratio of at least 1:1 condensing agent to carrier agent; (d) is present in the
complex at a ratio of at least 1:1 condensing agent to fusogenic agent; (e) is present in the
complex at a ratio of at least 1:1 condensing agent to targeting agent.

Other suitable condensing agents can be tested and identified by a skilled artisan, e.g., 
by evaluating the ability of a test agent to interact with a nucleic acid, e.g., an iRNA agent.
The ability of a test agent to interact with a nucleic acid, e.g., an iRNA agent, e.g., to 
condense or protect the iRNA agent, can be evaluated by routine techniques. In one assay, a 
test agent is contacted with a nucleic acid, and the size and/or charge of the contacted nucleic 
acid is evaluated by a technique suitable to detect changes in molecular mass and/or charge. 
Such techniques include non-denaturing gel electrophoresis, immunological methods, e.g.,
immunoprecipitation, gel filtration, ionic interaction chromatography, and the like. A test 
agent is identified as a condensing agent if it changes the mass and/or charge (preferably 
both) of the contacted nucleic acid, compared to a control. A two-step assay can also be 
performed, wherein a first assay evaluates the ability of a test compound alone to interact 
with, e.g., bind to, e.g., condense the charge and/or mass of, a nucleic acid; and a second assay
evaluates the ability of a modular complex that includes the test compound to interact with,
e.g., bind to, e.g., condense the charge and/or mass of, a nucleic acid. 

Amphipathic Delivery Agents

In one aspect, the invention features an amphipathic delivery conjugate or module, 
such as those described herein and those described in copending, co-owned United States 
Provisional Application Serial No. 60/455,050 (Attorney Docket No. 14174-065P01), filed 
on March 13, 2003, which is hereby incorporated by reference. 

In addition, the invention includes an iRNA agent described herein, e.g., a palindromic
iRNA agent, an iRNA agent having a non canonical pairing, an iRNA agent which targets a
gene described herein, e.g., a gene active in the liver, an iRNA agent having a chemical 
modification described herein, e.g., a modification which enhances resistance to degradation, 
an iRNA agent having an architecture or structure described herein, an iRNA agent 
administered as described herein, or an iRNA agent formulated as described herein, 
combined with, associated with, and delivered by such an amphipathic delivery conjugate. 

An amphipathic molecule is a molecule having a hydrophobic and a hydrophilic 
region. Such molecules can interact with (e.g., penetrate or disrupt) lipids, e.g., a lipid
bilayer of a cell. As such, they can serve as a delivery agent for an associated (e.g., bound)
iRNA (e.g., an iRNA or sRNA described herein). A preferred amphipathic molecule to be
used in the compositions described herein (e.g., the amphipathic iRNA constructs described
herein) is a polymer. The polymer may have a secondary structure, e.g., a repeating
secondary structure.

One example of an amphipathic polymer is an amphipathic polypeptide, e.g., a 
polypeptide having a secondary structure such that the polypeptide has a hydrophilic and a 
hydrophobic face. The design of amphipathic peptide structures (e.g., alpha-helical
polypeptides) is routine to one of skill in the art. For example, the following references
provide guidance: Grell et al. (2001) Protein design and folding: template trapping of
self-assembled helical bundles J Pept Sci 7(3):146-51; Chen et al. (2002) Determination of
stereochemistry stability coefficients of amino acid side-chains in an amphipathic alpha-helix
J Pept Res 59(1):18-33; Iwata et al. (1994) Design and synthesis of amphipathic 3(10)-helical
peptides and their interactions with phospholipid bilayers and ion channel formation J Biol
Chem 269(7):4928-33; Cornut et al. (1994) The amphipathic alpha-helix concept.
Application to the de novo design of ideally amphipathic Leu, Lys peptides with hemolytic
activity higher than that of melittin FEBS Lett 349(1):29-33; Negrete et al. (1998)
Deciphering the structural code for proteins: helical propensities in domain classes and
statistical multiresidue information in alpha-helices. Protein Sci 7(6):1368-79.
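
One common way to quantify whether a helical polypeptide has opposing hydrophilic and
hydrophobic faces is a helical hydrophobic moment. The sketch below is illustrative and is
not a method taken from this disclosure or from the references above. It assumes an ideal
alpha helix with 100 degrees of rotation per residue and uses the Kyte-Doolittle
hydropathy scale as an assumed choice of scale. The Leu/Lys test sequences are
hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values (assumed choice of scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydrophobic_moment(peptide: str, degrees_per_residue: float = 100.0) -> float:
    """Mean helical hydrophobic moment per residue.

    Each residue contributes a vector whose length is its hydropathy and whose
    direction is its angular position around the helix axis. A large resultant
    means hydrophobic and hydrophilic residues segregate onto opposite faces.
    """
    if not peptide:
        raise ValueError("peptide must contain at least one residue")
    step = math.radians(degrees_per_residue)
    x = sum(KYTE_DOOLITTLE[aa] * math.cos(i * step) for i, aa in enumerate(peptide))
    y = sum(KYTE_DOOLITTLE[aa] * math.sin(i * step) for i, aa in enumerate(peptide))
    return math.hypot(x, y) / len(peptide)


# Hypothetical 18-residue Leu/Lys sequences (18 residues = 5 full helical turns).
segregated = "LKKLLKKLLKLLKKLLKK"   # Leu on one face, Lys on the other
alternating = "LK" * 9              # same composition, no facial segregation

print(mean_hydrophobic_moment(segregated))
print(mean_hydrophobic_moment(alternating))
```

Both hypothetical sequences have the same composition. Only the first places its leucines
on a single face of the helix, so it gives a large moment while the alternating sequence
gives a moment near zero.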

Another example of an amphipathic polymer is a polymer made up of two or more 
amphipathic subunits, e.g., two or more subunits containing cyclic moieties (e.g., a cyclic
moiety having one or more hydrophilic groups and one or more hydrophobic groups). For
example, the subunit may contain a steroid, e.g., cholic acid; or an aromatic moiety. Such
moieties preferably can exhibit atropisomerism, such that they can form opposing 
hydrophobic and hydrophilic faces when in a polymer structure. 

The ability of a putative amphipathic molecule to interact with a lipid membrane, e.g.,
a cell membrane, can be tested by routine methods, e.g., in a cell free or cellular assay. For 
example, a test compound is combined or contacted with a synthetic lipid bilayer, a cellular 
membrane fraction, or a cell, and the test compound is evaluated for its ability to interact 
with, penetrate or disrupt the lipid bilayer, cell membrane or cell. The test compound can be
labeled in order to detect the interaction with the lipid bilayer, cell membrane or cell. In
another type of assay, the test compound is linked to a reporter molecule or an iRNA agent
(e.g., an iRNA or sRNA described herein) and the ability of the reporter molecule or iRNA
agent to penetrate the lipid bilayer, cell membrane or cell is evaluated. A two-step assay can
also be performed, wherein a first assay evaluates the ability of a test compound alone to
interact with a lipid bilayer, cell membrane or cell; and a second assay evaluates the ability of
a construct (e.g., a construct described herein) that includes the test compound and a reporter
or iRNA agent to interact with a lipid bilayer, cell membrane or cell.

An amphipathic polymer useful in the compositions described herein has at least 2, 
preferably at least 5, more preferably at least 10, 25, 50, 100, 200, 500, 1000, 2000, 50000 or
more subunits (e.g., amino acids or cyclic subunits). A single amphipathic polymer can be
linked to one or more, e.g., 2, 3, 5, 10 or more iRNA agents (e.g., iRNA or sRNA agents
described herein). In some embodiments, an amphipathic polymer can contain both amino
acid and cyclic subunits, e.g., aromatic subunits.

The invention features a composition that includes an iRNA agent (e.g., an iRNA or
sRNA described herein) in association with an amphipathic molecule. Such compositions
may be referred to herein as "amphipathic iRNA constructs." Such compositions and
constructs are useful in the delivery or targeting of iRNA agents, e.g., delivery or targeting
of iRNA agents to a cell. While not wanting to be bound by theory, such compositions and
constructs can increase the porosity of, e.g., can penetrate or disrupt, a lipid (e.g., a lipid
bilayer of a cell), e.g., to allow entry of the iRNA agent into a cell.

In one aspect, the invention relates to a composition comprising an iRNA agent (e.g., 
an iRNA or sRNA agent described herein) linked to an amphipathic molecule. The iRNA 
agent and the amphipathic molecule may be held in continuous contact with one another by 
either covalent or noncovalent linkages. 

The amphipathic molecule of the composition or construct is preferably other than a
phospholipid, e.g., other than a micelle, membrane or membrane fragment.

The amphipathic molecule of the composition or construct is preferably a polymer. 
The polymer may include two or more amphipathic subunits. One or more hydrophilic 
groups and one or more hydrophobic groups may be present on the polymer. The polymer
may have a repeating secondary structure as well as a first face and a second face. The
distribution of the hydrophilic groups and the hydrophobic groups along the repeating
secondary structure can be such that one face of the polymer is a hydrophilic face and the
other face of the polymer is a hydrophobic face.

The amphipathic molecule can be a polypeptide, e.g., a polypeptide comprising an 
α-helical conformation as its secondary structure.

In one embodiment, the amphipathic polymer includes one or more subunits 
containing one or more cyclic moieties (e.g., a cyclic moiety having one or more hydrophilic
groups and/or one or more hydrophobic groups). In one embodiment, the polymer is a
polymer of cyclic moieties such that the moieties have alternating hydrophobic and
hydrophilic groups. For example, the subunit may contain a steroid, e.g., cholic acid. In
another example, the subunit may contain an aromatic moiety. The aromatic moiety may be
one that can exhibit atropisomerism, e.g., a 2,2'-bis(substituted)-1,1'-binaphthyl or a
2,2'-bis(substituted) biphenyl. A subunit may include an aromatic moiety of Formula (M):





(M) 


Referring to Formula M, R1 is C1-C100 alkyl optionally substituted with aryl, alkenyl,
alkynyl, alkoxy or halo and/or optionally inserted with O, S, alkenyl or alkynyl; C1-C100
perfluoroalkyl; or OR5.

R2 is hydroxy; nitro; sulfate; phosphate; phosphate ester; sulfonic acid; OR6; or C1-C100
alkyl optionally substituted with hydroxy, halo, nitro, aryl or alkyl sulfinyl, aryl or alkyl
sulfonyl, sulfate, sulfonic acid, phosphate, phosphate ester, substituted or unsubstituted aryl,
carboxyl, carboxylate, amino carbonyl, or alkoxycarbonyl, and/or optionally inserted with O,
NH, S, S(O), SO2, alkenyl, or alkynyl.

R3 is hydrogen, or when taken together with R4 forms a fused phenyl ring.

R4 is hydrogen, or when taken together with R3 forms a fused phenyl ring.

R5 is C1-C100 alkyl optionally substituted with aryl, alkenyl, alkynyl, alkoxy or halo
and/or optionally inserted with O, S, alkenyl or alkynyl; or C1-C100 perfluoroalkyl; and R6 is
C1-C100 alkyl optionally substituted with hydroxy, halo, nitro, aryl or alkyl sulfinyl, aryl or
alkyl sulfonyl, sulfate, sulfonic acid, phosphate, phosphate ester, substituted or unsubstituted
aryl, carboxyl, carboxylate, amino carbonyl, or alkoxycarbonyl, and/or optionally inserted
with O, NH, S, S(O), SO2, alkenyl, or alkynyl.

Increasing cellular uptake of dsRNAs 

A method of the invention can include the administration of an iRNA agent and a
drug that affects the uptake of the iRNA agent into the cell. The drug can be administered
before, after, or at the same time that the iRNA agent is administered. The drug can be
covalently linked to the iRNA agent. The drug can be, for example, a lipopolysaccharide, an
activator of p38 MAP kinase, or an activator of NF-κB. The drug can have a transient effect
on the cell. 

The drug can increase the uptake of the iRNA agent into the cell, for example, by
disrupting the cell's cytoskeleton, e.g., by disrupting the cell's microtubules, microfilaments,
and/or intermediate filaments. The drug can be, for example, taxol, vincristine, vinblastine,
cytochalasin, nocodazole, jasplakinolide, latrunculin A, phalloidin, swinholide A, indanocine,
or myoseverin.

The drug can also increase the uptake of the iRNA agent into the cell by activating an 
inflammatory response, for example. Exemplary drugs that would have such an effect
include tumor necrosis factor alpha (TNFalpha), interleukin-1 beta, or gamma interferon.

iRNA conjugates 

An iRNA agent can be coupled, e.g., covalently coupled, to a second agent. For 
example, an iRNA agent used to treat a particular disorder can be coupled to a second 
therapeutic agent, e.g., an agent other than the iRNA agent. The second therapeutic agent
can be one which is directed to the treatment of the same disorder. For example, in the case 
of an iRNA used to treat a disorder characterized by unwanted cell proliferation, e.g., cancer, 
the iRNA agent can be coupled to a second agent which has an anti-cancer effect. For 
example, it can be coupled to an agent which stimulates the immune system, e.g., a CpG 
motif, or more generally an agent that activates a toll-like receptor and/or increases the
production of gamma interferon. 



WO 2004/080406

PCT/US2004/007070



iRNA Production 

An iRNA can be produced, e.g., in bulk, by a variety of methods. Exemplary 
methods include: organic synthesis and RNA cleavage, e.g., in vitro cleavage.

Organic Synthesis

An iRNA can be made by separately synthesizing each respective strand of a
double-stranded RNA molecule. The component strands can then be annealed.

A large bioreactor, e.g., the OligoPilot II from Pharmacia Biotec AB (Uppsala,
Sweden), can be used to produce a large amount of a particular RNA strand for a given
iRNA. The OligoPilot II reactor can efficiently couple a nucleotide using only a 1.5 molar
excess of a phosphoramidite nucleotide. To make an RNA strand, ribonucleotide amidites
are used. Standard cycles of monomer addition can be used to synthesize the 21 to 23
nucleotide strand for the iRNA. Typically, the two complementary strands are produced
separately and then annealed, e.g., after release from the solid support and deprotection.

Organic synthesis can be used to produce a discrete iRNA species. The
complementarity of the species to a particular target gene can be precisely specified. For
example, the species may be complementary to a region that includes a polymorphism, e.g., a
single nucleotide polymorphism. Further, the location of the polymorphism can be precisely
defined. In some embodiments, the polymorphism is located in an internal region, e.g., at
least 4, 5, 7, or 9 nucleotides from one or both of the termini.
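
The positional constraint described above can be sketched as a simple check. The code
below is illustrative only. It assumes that a position at 0-based index i counts as i
nucleotides from the 5' terminus, which is one possible reading of the text, and all
function names are hypothetical.

```python
from typing import Tuple

# Watson-Crick complement table for RNA.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the strand complementary to `rna`, written 5' to 3'."""
    return rna.translate(_COMPLEMENT)[::-1]


def distances_from_termini(index: int, length: int) -> Tuple[int, int]:
    """Nucleotides separating a 0-based position from the 5' and 3' termini."""
    if not 0 <= index < length:
        raise ValueError("index must lie within the strand")
    return index, length - 1 - index


def is_internal(index: int, length: int, min_distance: int,
                both_termini: bool = True) -> bool:
    """True if the position is at least `min_distance` nt from the termini.

    both_termini=True requires the distance from both ends; False accepts a
    position that is far enough from at least one end.
    """
    d5, d3 = distances_from_termini(index, length)
    if both_termini:
        return min(d5, d3) >= min_distance
    return max(d5, d3) >= min_distance


# A polymorphism at the central position of a 21-nt strand satisfies even the
# strictest threshold listed above (9 nt from both termini).
print(is_internal(index=10, length=21, min_distance=9))
```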

dsRNA Cleavage 

iRNAs can also be made by cleaving a larger ds iRNA. The cleavage can be 
mediated in vitro or in vivo. For example, to produce iRNAs by cleavage in vitro, the 
following method can be used: 

In vitro transcription. dsRNA is produced by transcribing a nucleic acid (DNA)
segment in both directions. For example, the HiScribe™ RNAi transcription kit (New
England Biolabs) provides a vector and a method for producing a dsRNA for a nucleic acid
segment that is cloned into the vector at a position flanked on either side by a T7 promoter.
Separate templates are generated for T7 transcription of the two complementary strands for
the dsRNA. The templates are transcribed in vitro by addition of T7 RNA polymerase and
dsRNA is produced. Similar methods using PCR and/or other RNA polymerases (e.g., T3 or
SP6 polymerase) can also be used. In one embodiment, RNA generated by this method is
carefully purified to remove endotoxins that may contaminate preparations of the
recombinant enzymes.

In vitro cleavage. dsRNA is cleaved in vitro into iRNAs, for example, using a Dicer
or comparable RNAse III-based activity. For example, the dsRNA can be incubated in an in
vitro extract from Drosophila or using purified components, e.g., a purified RNAse or RISC
complex (RNA-induced silencing complex). See, e.g., Ketting et al. Genes Dev 2001 Oct
15;15(20):2654-9, and Hammond Science 2001 Aug 10;293(5532):1146-50.

dsRNA cleavage generally produces a plurality of iRNA species, each being a 
particular 21 to 23 nt fragment of a source dsRNA molecule. For example, iRNAs that 
include sequences complementary to overlapping regions and adjacent regions of a source 
dsRNA molecule may be present. 
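
The relationship between a source dsRNA and the overlapping or adjacent fragments
derived from it can be illustrated with a short enumeration. This is a simplified sketch,
not a model of Dicer activity: it lists fixed-length windows along one strand, and the
function name and the example sequence are hypothetical.

```python
from typing import Iterator, Tuple


def enumerate_fragments(source: str, size: int = 21,
                        step: int = 1) -> Iterator[Tuple[int, str]]:
    """Yield (start, fragment) pairs for fixed-length windows of `source`.

    step=1 gives maximally overlapping fragments; step=size gives adjacent,
    non-overlapping fragments.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    for start in range(0, len(source) - size + 1, step):
        yield start, source[start:start + size]


# Hypothetical 44-nt strand of a source dsRNA.
source_strand = "ACGU" * 11

overlapping = list(enumerate_fragments(source_strand, size=21, step=1))
adjacent = list(enumerate_fragments(source_strand, size=22, step=22))

print(len(overlapping), "overlapping 21-nt fragments")
print([start for start, _ in adjacent], "start positions of adjacent 22-nt fragments")
```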

Regardless of the method of synthesis, the iRNA preparation can be prepared in a 
solution (e.g., an aqueous and/or organic solution) that is appropriate for formulation. For 
example, the iRNA preparation can be precipitated and redissolved in pure double-distilled 
water, and lyophilized. The dried iRNA can then be resuspended in a solution appropriate for 
the intended formulation process. 

Synthesis of modified and nucleotide surrogate iRNA agents is discussed below. 

FORMULATION 

The iRNA agents described herein can be formulated for administration to a subject.

For ease of exposition the formulations, compositions and methods in this section are 
discussed largely with regard to unmodified iRNA agents. It should be understood, however,
that these formulations, compositions and methods can be practiced with other iRNA agents, 
e.g., modified iRNA agents, and such practice is within the invention. 

A formulated iRNA composition can assume a variety of states. In some examples, 
the composition is at least partially crystalline, uniformly crystalline, and/or anhydrous (e.g., 
less than 80, 50, 30, 20, or 10% water). In another example, the iRNA is in an aqueous 
phase, e.g., in a solution that includes water. 

The aqueous phase or the crystalline compositions can, e.g., be incorporated into a 
delivery vehicle, e.g., a liposome (particularly for the aqueous phase) or a particle (e.g., a
microparticle as can be appropriate for a crystalline composition). Generally, the iRNA
composition is formulated in a manner that is compatible with the intended method of
administration (see below).

In particular embodiments, the composition is prepared by at least one of the 
following methods: spray drying, lyophilization, vacuum drying, evaporation, fluid bed
drying, or a combination of these techniques; or sonication with a lipid, freeze-drying, 
condensation and other self-assembly. 

An iRNA preparation can be formulated in combination with another agent, e.g.,
another therapeutic agent or an agent that stabilizes an iRNA, e.g., a protein that complexes
with iRNA to form an iRNP. Still other agents include chelators, e.g., EDTA (e.g., to
remove divalent cations such as Mg2+), salts, RNAse inhibitors (e.g., a broad specificity
RNAse inhibitor such as RNAsin) and so forth.

In one embodiment, the iRNA preparation includes another iRNA agent, e.g., a 
second iRNA that can mediate RNAi with respect to a second gene, or with respect to the
same gene. Still other preparations can include at least 3, 5, ten, twenty, fifty, or a hundred or
more different iRNA species. Such iRNAs can mediate RNAi with respect to a similar
number of different genes. 

In one embodiment, the iRNA preparation includes at least a second therapeutic agent 
(e.g., an agent other than an RNA or a DNA). For example, an iRNA composition for the
treatment of a viral disease, e.g., HIV, might include a known antiviral agent (e.g., a protease
inhibitor or reverse transcriptase inhibitor). In another example, an iRNA composition for the
treatment of a cancer might further comprise a chemotherapeutic agent. 

Exemplary formulations are discussed below: 

Liposomes 

For ease of exposition the formulations, compositions and methods in this section are
discussed largely with regard to unmodified iRNA agents. It should be understood, however,
that these formulations, compositions and methods can be practiced with other iRNA agents,
e.g., modified iRNA agents, and such practice is within the invention. An iRNA agent,
e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA
agent which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent,
e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) preparation can be
formulated for delivery in a membranous molecular assembly, e.g., a liposome or a micelle.
As used herein, the term "liposome" refers to a vesicle composed of amphiphilic lipids
arranged in at least one bilayer, e.g., one bilayer or a plurality of bilayers. Liposomes include
unilamellar and multilamellar vesicles that have a membrane formed from a lipophilic
material and an aqueous interior. The aqueous portion contains the iRNA composition. The
lipophilic material isolates the aqueous interior from an aqueous exterior, which typically
does not include the iRNA composition, although in some examples, it may. Liposomes are
useful for the transfer and delivery of active ingredients to the site of action. Because the
liposomal membrane is structurally similar to biological membranes, when liposomes are
applied to a tissue, the liposomal bilayer fuses with the bilayer of the cellular membranes. As the
merging of the liposome and cell progresses, the internal aqueous contents that include the
iRNA are delivered into the cell where the iRNA can specifically bind to a target RNA and
can mediate RNAi. In some cases the liposomes are also specifically targeted, e.g., to direct
the iRNA to particular cell types.

A liposome containing an iRNA can be prepared by a variety of methods.

In one example, the lipid component of a liposome is dissolved in a detergent so that 
micelles are formed with the lipid component. For example, the lipid component can be an 
amphipathic cationic lipid or lipid conjugate. The detergent can have a high critical micelle 
concentration and may be nonionic. Exemplary detergents include cholate, CHAPS,
octylglucoside, deoxycholate, and lauroyl sarcosine. The iRNA preparation is then added to
the micelles that include the lipid component. The cationic groups on the lipid interact with
the iRNA and condense around the iRNA to form a liposome. After condensation, the
detergent is removed, e.g., by dialysis, to yield a liposomal preparation of iRNA.

If necessary, a carrier compound that assists in condensation can be added during the
condensation reaction, e.g., by controlled addition. For example, the carrier compound can
be a polymer other than a nucleic acid (e.g., spermine or spermidine). pH can also be adjusted
to favor condensation.

Methods for producing stable polynucleotide delivery vehicles, which incorporate a
polynucleotide/cationic lipid complex as structural components of the delivery vehicle,
are further described in, e.g., WO 96/37194. Liposome formation can also include
one or more aspects of exemplary methods described in Felgner, P. L. et al., Proc. Natl.
Acad. Sci. USA 8:7413-7417, 1987; U.S. Pat. No. 4,897,355; U.S. Pat. No. 5,171,678;
Bangham, et al. J. Mol. Biol. 23:238, 1965; Olson, et al. Biochim. Biophys. Acta 557:9,
1979; Szoka, et al. Proc. Natl. Acad. Sci. 75:4194, 1978; Mayhew, et al. Biochim. Biophys.
Acta 775:169, 1984; Kim, et al. Biochim. Biophys. Acta 728:339, 1983; and Fukunaga, et al.
Endocrinol. 115:757, 1984. Commonly used techniques for preparing lipid aggregates of
appropriate size for use as delivery vehicles include sonication and freeze-thaw plus
extrusion (see, e.g., Mayer, et al. Biochim. Biophys. Acta 858:161, 1986). Microfluidization
can be used when consistently small (50 to 200 nm) and relatively uniform aggregates are
desired (Mayhew, et al. Biochim. Biophys. Acta 775:169, 1984). These methods are readily
adapted to packaging iRNA preparations into liposomes.

Liposomes that are pH-sensitive or negatively-charged entrap nucleic acid molecules
rather than complex with them. Since both the nucleic acid molecules and the lipid are
similarly charged, repulsion rather than complex formation occurs. Nevertheless, some
nucleic acid molecules are entrapped within the aqueous interior of these liposomes.
pH-sensitive liposomes have been used to deliver DNA encoding the thymidine kinase gene to
cell monolayers in culture. Expression of the exogenous gene was detected in the target cells
(Zhou et al. Journal of Controlled Release, 19, (1992) 269-274).

One major type of liposomal composition includes phospholipids other than
naturally-derived phosphatidylcholine. Neutral liposome compositions, for example, can be
formed from dimyristoyl phosphatidylcholine (DMPC) or dipalmitoyl phosphatidylcholine
(DPPC). Anionic liposome compositions generally are formed from dimyristoyl
phosphatidylglycerol, while anionic fusogenic liposomes are formed primarily from dioleoyl
phosphatidylethanolamine (DOPE). Another type of liposomal composition is formed from
phosphatidylcholine (PC) such as, for example, soybean PC and egg PC. Another type is
formed from mixtures of phospholipid and/or phosphatidylcholine and/or cholesterol.

Examples of other methods to introduce liposomes into cells in vitro and in vivo
include U.S. Pat. No. 5,283,185; U.S. Pat. No. 5,171,678; WO 94/00569; WO 93/24640; WO
91/16024; Felgner, J. Biol. Chem. 269:2550, 1994; Nabel, Proc. Natl. Acad. Sci. 90:11307,
1993; Nabel, Human Gene Ther. 3:649, 1992; Gershon, Biochem. 32:7143, 1993; and Strauss,
EMBO J. 1992.

In one embodiment, cationic liposomes are used. Cationic liposomes possess the
advantage of being able to fuse to the cell membrane. Non-cationic liposomes, although not
able to fuse as efficiently with the plasma membrane, are taken up by macrophages in vivo
and can be used to deliver iRNAs to macrophages.

Further advantages of liposomes include: liposomes obtained from natural
phospholipids are biocompatible and biodegradable; liposomes can incorporate a wide range
of water and lipid soluble drugs; liposomes can protect encapsulated iRNAs in their internal
compartments from metabolism and degradation (Rosoff, in "Pharmaceutical Dosage
Forms," Lieberman, Rieger and Banker (Eds.), 1988, volume 1, p. 245). Important
considerations in the preparation of liposome formulations are the lipid surface charge,
vesicle size and the aqueous volume of the liposomes.

A positively charged synthetic cationic lipid,
N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium chloride (DOTMA), can be used to
form small liposomes that interact spontaneously with nucleic acid to form lipid-nucleic
acid complexes which are capable of fusing with the negatively charged lipids of the cell
membranes of tissue culture cells, resulting in delivery of iRNA (see, e.g., Felgner, P. L.
et al., Proc. Natl. Acad. Sci. USA 8:7413-7417, 1987 and U.S. Pat. No. 4,897,355 for a
description of DOTMA and its use with DNA).

A DOTMA analogue, 1,2-bis(oleoyloxy)-3-(trimethylammonia)propane (DOTAP),
can be used in combination with a phospholipid to form DNA-complexing vesicles.

Lipofectin™ (Bethesda Research Laboratories, Gaithersburg, Md.) is an effective agent for
the delivery of highly anionic nucleic acids into living tissue culture cells. It comprises
positively charged DOTMA liposomes which interact spontaneously with negatively charged
polynucleotides to form complexes. When enough positively charged liposomes are used,
the net charge on the resulting complexes is also positive. Positively charged complexes
prepared in this way spontaneously attach to negatively charged cell surfaces, fuse with the
plasma membrane, and efficiently deliver functional nucleic acids into, for example, tissue
culture cells. Another commercially available cationic lipid,
1,2-bis(oleoyloxy)-3,3-(trimethylammonia)propane ("DOTAP") (Boehringer Mannheim,
Indianapolis, Indiana), differs from DOTMA in that the oleoyl moieties are linked by ester,
rather than ether, linkages.

Other reported cationic lipid compounds include those that have been conjugated to a
variety of moieties including, for example, carboxyspermine which has been conjugated to
one of two types of lipids and includes compounds such as 5-carboxyspermylglycine
dioctaoleoylamide ("DOGS") (Transfectam™, Promega, Madison, Wisconsin) and
dipalmitoylphosphatidylethanolamine 5-carboxyspermyl-amide ("DPPES") (see, e.g., U.S.
Pat. No. 5,171,678).

Another cationic lipid conjugate includes derivatization of the lipid with cholesterol
("DC-Chol") which has been formulated into liposomes in combination with DOPE (see
Gao, X. and Huang, L., Biochem. Biophys. Res. Commun. 179:280, 1991). Lipopolylysine,
made by conjugating polylysine to DOPE, has been reported to be effective for transfection
in the presence of serum (Zhou, X. et al., Biochim. Biophys. Acta 1065:8, 1991). For certain
cell lines, these liposomes containing conjugated cationic lipids are said to exhibit lower
toxicity and provide more efficient transfection than the DOTMA-containing compositions.
Other commercially available cationic lipid products include DMRIE and DMRIE-HP
(Vical, La Jolla, California) and Lipofectamine (DOSPA) (Life Technology, Inc.,
Gaithersburg, Maryland). Other cationic lipids suitable for the delivery of oligonucleotides
are described in WO 98/39359 and WO 96/37194.

Liposomal formulations are particularly suited for topical administration; liposomes
present several advantages over other formulations. Such advantages include reduced side
effects related to high systemic absorption of the administered drug, increased accumulation
of the administered drug at the desired target, and the ability to administer iRNA into the
skin. In some implementations, liposomes are used for delivering iRNA to epidermal cells
and also to enhance the penetration of iRNA into dermal tissues, e.g., into skin. For example,
the liposomes can be applied topically. Topical delivery of drugs formulated as liposomes to
the skin has been documented (see, e.g., Weiner et al., Journal of Drug Targeting, 1992, vol.
2, 405-410 and du Plessis et al., Antiviral Research, 18, 1992, 259-265; Mannino, R. J. and
Gould-Fogerite, S., Biotechniques 6:682-690, 1988; Itani, T. et al., Gene 56:267-276, 1987;
Nicolau, C. et al., Meth. Enz. 149:157-176, 1987; Straubinger, R. M. and Papahadjopoulos,
D., Meth. Enz. 101:512-527, 1983; Wang, C. Y. and Huang, L., Proc. Natl. Acad. Sci. USA
84:7851-7855, 1987).



Non-ionic liposomal systems have also been examined to determine their utility in the
delivery of drugs to the skin, in particular systems comprising non-ionic surfactant and
cholesterol. Non-ionic liposomal formulations comprising Novasome I (glyceryl
dilaurate/cholesterol/polyoxyethylene-10-stearyl ether) and Novasome II (glyceryl distearate/
cholesterol/polyoxyethylene-10-stearyl ether) were used to deliver a drug into the dermis of
mouse skin. Such formulations with iRNA are useful for treating a dermatological disorder.

Liposomes that include iRNA can be made highly deformable. Such deformability
can enable the liposomes to penetrate through pores that are smaller than the average radius
of the liposome. For example, transfersomes are a type of deformable liposome.
Transfersomes can be made by adding surface edge activators, usually surfactants, to a
standard liposomal composition. Transfersomes that include iRNA can be delivered, for
example, subcutaneously by injection in order to deliver iRNA to keratinocytes in the skin.
In order to cross intact mammalian skin, lipid vesicles must pass through a series of fine
pores, each with a diameter less than 50 nm, under the influence of a suitable transdermal
gradient. In addition, due to the lipid properties, these transfersomes can be self-optimizing
(adaptive to the shape of pores, e.g., in the skin), self-repairing, and often self-loading,
and can frequently reach their targets without fragmenting. The iRNA agents can include an
RRMS tethered to a moiety which improves association with a liposome.

Surfactants 

For ease of exposition the formulations, compositions and methods in this section are
discussed largely with regard to unmodified iRNA agents. It should be understood, however,
that these formulations, compositions and methods can be practiced with other iRNA agents,
e.g., modified iRNA agents, and such practice is within the invention. Surfactants find wide
application in formulations such as emulsions (including microemulsions) and liposomes (see
above). iRNA (or a precursor, e.g., a larger dsRNA which can be processed into an iRNA, or
a DNA which encodes an iRNA or precursor) compositions can include a surfactant. In one
embodiment, the iRNA is formulated as an emulsion that includes a surfactant. The most
common way of classifying and ranking the properties of the many different types of
surfactants, both natural and synthetic, is by the use of the hydrophile/lipophile balance
(HLB). The nature of the hydrophilic group provides the most useful means for categorizing
the different surfactants used in formulations (Rieger, in "Pharmaceutical Dosage Forms,"
Marcel Dekker, Inc., New York, NY, 1988, p. 285).

If the surfactant molecule is not ionized, it is classified as a nonionic surfactant.
Nonionic surfactants find wide application in pharmaceutical products and are usable over a
wide range of pH values. In general their HLB values range from 2 to about 18 depending
on their structure. Nonionic surfactants include nonionic esters such as ethylene glycol
esters, propylene glycol esters, glyceryl esters, polyglyceryl esters, sorbitan esters, sucrose
esters, and ethoxylated esters. Nonionic alkanolamides and ethers such as fatty alcohol
ethoxylates, propoxylated alcohols, and ethoxylated/propoxylated block polymers are also
included in this class. The polyoxyethylene surfactants are the most popular members of the
nonionic surfactant class.

If the surfactant molecule carries a negative charge when it is dissolved or dispersed
in water, the surfactant is classified as anionic. Anionic surfactants include carboxylates
such as soaps, acyl lactylates, acyl amides of amino acids, esters of sulfuric acid such as alkyl
sulfates and ethoxylated alkyl sulfates, sulfonates such as alkyl benzene sulfonates, acyl
isethionates, acyl taurates and sulfosuccinates, and phosphates. The most important members
of the anionic surfactant class are the alkyl sulfates and the soaps.

If the surfactant molecule carries a positive charge when it is dissolved or dispersed in
water, the surfactant is classified as cationic. Cationic surfactants include quaternary
ammonium salts and ethoxylated amines. The quaternary ammonium salts are the most used
members of this class.

If the surfactant molecule has the ability to carry either a positive or negative charge,
the surfactant is classified as amphoteric. Amphoteric surfactants include acrylic acid
derivatives, substituted alkylamides, N-alkylbetaines and phosphatides.

The use of surfactants in drug products, formulations and in emulsions has been
reviewed (Rieger, in "Pharmaceutical Dosage Forms," Marcel Dekker, Inc., New York, NY,
1988, p. 285).
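
The charge-based classification of surfactants described in this section can be sketched as a small lookup. This is an illustrative sketch only: the function name `classify_surfactant` and its two boolean inputs are hypothetical and are not part of the disclosure.

```python
def classify_surfactant(can_be_positive: bool, can_be_negative: bool) -> str:
    """Classify a surfactant by the charge its molecule can carry in water.

    Hypothetical helper: the two flags state whether the molecule can carry
    a positive and/or a negative charge when dissolved or dispersed.
    """
    if can_be_positive and can_be_negative:
        return "amphoteric"  # able to carry either charge
    if can_be_negative:
        return "anionic"
    if can_be_positive:
        return "cationic"
    return "nonionic"  # molecule is not ionized


# Example members of each class, as named in the text above.
examples = {
    "alkyl sulfate": classify_surfactant(False, True),
    "quaternary ammonium salt": classify_surfactant(True, False),
    "N-alkylbetaine": classify_surfactant(True, True),
    "sorbitan ester": classify_surfactant(False, False),
}
```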

Micelles and other Membranous Formulations 

For ease of exposition the micelles and other formulations, compositions and methods
in this section are discussed largely with regard to unmodified iRNA agents. It should be
understood, however, that these micelles and other formulations, compositions and methods
can be practiced with other iRNA agents, e.g., modified iRNA agents, and such practice is
within the invention. The iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent,
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) composition can be provided as a micellar formulation. "Micelles" are
defined herein as a particular type of molecular assembly in which amphipathic molecules
are arranged in a spherical structure such that all the hydrophobic portions of the molecules
are directed inward, leaving the hydrophilic portions in contact with the surrounding aqueous
phase. The converse arrangement exists if the environment is hydrophobic.

A mixed micellar formulation suitable for delivery through transdermal membranes
may be prepared by mixing an aqueous solution of the iRNA composition, an alkali metal C8
to C22 alkyl sulphate, and micelle forming compounds. Exemplary micelle forming
compounds include lecithin, hyaluronic acid, pharmaceutically acceptable salts of hyaluronic
acid, glycolic acid, lactic acid, chamomile extract, cucumber extract, oleic acid, linoleic acid,
linolenic acid, monoolein, monooleates, monolaurates, borage oil, evening primrose oil,
menthol, trihydroxy oxo cholanyl glycine and pharmaceutically acceptable salts thereof,
glycerin, polyglycerin, lysine, polylysine, triolein, polyoxyethylene ethers and analogues
thereof, polidocanol alkyl ethers and analogues thereof, chenodeoxycholate, deoxycholate,
and mixtures thereof. The micelle forming compounds may be added at the same time or
after addition of the alkali metal alkyl sulphate. Mixed micelles will form with substantially
any kind of mixing of the ingredients but vigorous mixing is preferred in order to provide
smaller size micelles.

In one method a first micellar composition is prepared which contains the iRNA
composition and at least the alkali metal alkyl sulphate. The first micellar composition is then
mixed with at least three micelle forming compounds to form a mixed micellar composition.
In another method, the micellar composition is prepared by mixing the iRNA composition,
the alkali metal alkyl sulphate and at least one of the micelle forming compounds, followed
by addition of the remaining micelle forming compounds, with vigorous mixing.

Phenol and/or m-cresol may be added to the mixed micellar composition to stabilize
the formulation and protect against bacterial growth. Alternatively, phenol and/or m-cresol
may be added with the micelle forming ingredients. An isotonic agent such as glycerin may
also be added after formation of the mixed micellar composition.

For delivery of the micellar formulation as a spray, the formulation can be put into an
aerosol dispenser and the dispenser is charged with a propellant. The propellant, which is
under pressure, is in liquid form in the dispenser. The ratios of the ingredients are adjusted
so that the aqueous and propellant phases become one, i.e., there is one phase. If there are
two phases, it is necessary to shake the dispenser prior to dispensing a portion of the
contents, e.g., through a metered valve. The dispensed dose of pharmaceutical agent is
propelled from the metered valve in a fine spray.

The preferred propellants are hydrogen-containing chlorofluorocarbons,
hydrogen-containing fluorocarbons, dimethyl ether and diethyl ether. Even more preferred is
HFA 134a (1,1,1,2-tetrafluoroethane).

The specific concentrations of the essential ingredients can be determined by
relatively straightforward experimentation. For absorption through the oral cavities, it is
often desirable to increase, e.g., at least double or triple, the dosage relative to that used
for injection or administration through the gastrointestinal tract.

The iRNA agents can include an RRMS tethered to a moiety which improves 
association with a micelle or other membranous formulation. 

Particles 

For ease of exposition the particles, formulations, compositions and methods in this
section are discussed largely with regard to unmodified iRNA agents. It should be
understood, however, that these particles, formulations, compositions and methods can be
practiced with other iRNA agents, e.g., modified iRNA agents, and such practice is within
the invention. In another embodiment, an iRNA agent, e.g., a double-stranded iRNA agent,
or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed into a
sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent,
or sRNA agent, or precursor thereof) preparation may be incorporated into a particle, e.g., a
microparticle. Microparticles can be produced by spray-drying, but may also be produced by
other methods including lyophilization, evaporation, fluid bed drying, vacuum drying, or a
combination of these techniques. See below for further description.

Sustained-Release Formulations. An iRNA agent, e.g., a double-stranded iRNA
agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed
into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA
agent, or sRNA agent, or precursor thereof) described herein can be formulated for
controlled, e.g., slow release. Controlled release can be achieved by disposing the iRNA
within a structure or substance which impedes its release. E.g., iRNA can be disposed within
a porous matrix or in an erodable matrix, either of which allows release of the iRNA over a
period of time.

Polymeric particles, e.g., polymeric microparticles, can be used as a sustained-release
reservoir of iRNA that is taken up by cells only after it is released from the microparticle
through biodegradation. The polymeric particles in this embodiment should therefore be
large enough to preclude phagocytosis (e.g., larger than 10 µm and preferably larger than 20
µm). Such particles can be produced by the same methods used to make smaller particles, but
with less vigorous mixing of the first and second emulsions. That is to say, a lower
homogenization speed, vortex mixing speed, or sonication setting can be used to obtain
particles having a diameter around 100 µm rather than 10 µm. The time of mixing also can be
altered.

Larger microparticles can be formulated as a suspension, a powder, or an implantable
solid, to be delivered by intramuscular, subcutaneous, intradermal, intravenous, or
intraperitoneal injection; via inhalation (intranasal or intrapulmonary); orally; or by
implantation. These particles are useful for delivery of any iRNA when slow release over a
relatively long term is desired. The rate of degradation, and consequently of release, varies
with the polymeric formulation.

Microparticles preferably include pores, voids, hollows, defects or other interstitial
spaces that allow the fluid suspension medium to freely permeate or perfuse the particulate
boundary. For example, the perforated microstructures can be used to form hollow, porous
spray dried microspheres.

Polymeric particles containing iRNA (e.g., a sRNA) can be made using a double
emulsion technique, for instance. First, the polymer is dissolved in an organic solvent. A
preferred polymer is polylactic-co-glycolic acid (PLGA), with a lactic/glycolic acid weight
ratio of 65:35, 50:50, or 75:25. Next, a sample of nucleic acid suspended in aqueous solution
is added to the polymer solution and the two solutions are mixed to form a first emulsion.
The solutions can be mixed by vortexing or shaking, and in a preferred method, the mixture
can be sonicated. Most preferable is any method by which the nucleic acid receives the least
amount of damage in the form of nicking, shearing, or degradation, while still allowing the
formation of an appropriate emulsion. For example, acceptable results can be obtained with a
Vibra-cell model VC-250 sonicator with a 1/8" microtip probe, at setting #3.
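
As a worked illustration of the lactic/glycolic acid weight ratios given above, the following sketch splits a polymer mass into its two portions by weight. The function name `plga_component_masses` and the 100 mg example mass are hypothetical and chosen only for illustration; the disclosure does not specify a polymer quantity.

```python
def plga_component_masses(total_mg: float, lactic: float, glycolic: float):
    """Split a PLGA mass into lactic and glycolic portions by weight ratio.

    Hypothetical helper: `lactic` and `glycolic` are the two parts of a
    weight ratio such as 65:35.
    """
    whole = lactic + glycolic
    if total_mg < 0 or lactic < 0 or glycolic < 0 or whole == 0:
        raise ValueError("mass and ratio parts must be non-negative, ratio non-zero")
    return total_mg * lactic / whole, total_mg * glycolic / whole


# The three weight ratios named in the text, applied to an assumed 100 mg.
RATIOS = [(65, 35), (50, 50), (75, 25)]
masses_for_100_mg = {
    f"{l}:{g}": plga_component_masses(100.0, l, g) for l, g in RATIOS
}
```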

Spray-Drying. An iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent,
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) can be prepared by spray drying. Spray dried iRNA can be administered
to a subject or be subjected to further formulation. A pharmaceutical composition of iRNA
can be prepared by spray drying a homogeneous aqueous mixture that includes an iRNA under
conditions sufficient to provide a dispersible powdered composition, e.g., a pharmaceutical
composition. The material for spray drying can also include one or more of: a
pharmaceutically acceptable excipient, or a dispersibility-enhancing amount of a
physiologically acceptable, water-soluble protein. The spray-dried product can be a
dispersible powder that includes the iRNA.

Spray drying is a process that converts a liquid or slurry material to a dried particulate
form. Spray drying can be used to provide powdered material for various administrative
routes including inhalation. See, for example, M. Sacchetti and M. M. Van Oort in:
Inhalation Aerosols: Physical and Biological Basis for Therapy, A. J. Hickey, ed. Marcel
Dekker, New York, 1996.

Spray drying can include atomizing a solution, emulsion, or suspension to form a fine
mist of droplets and drying the droplets. The mist can be projected into a drying chamber
(e.g., a vessel, tank, tubing, or coil) where it contacts a drying gas. The mist can include
solid or liquid pore forming agents. The solvent and pore forming agents evaporate from the
droplets into the drying gas to solidify the droplets, simultaneously forming pores throughout
the solid. The solid (typically in a powder, particulate form) then is separated from the drying
gas and collected.

Spray drying includes bringing together a highly dispersed liquid and a sufficient
volume of air (e.g., hot air) to produce evaporation and drying of the liquid droplets. The
preparation to be spray dried can be any solution, coarse suspension, slurry, colloidal
dispersion, or paste that may be atomized using the selected spray drying apparatus.
Typically, the feed is sprayed into a current of warm filtered air that evaporates the solvent
and conveys the dried product to a collector. The spent air is then exhausted with the solvent.
Several different types of apparatus may be used to provide the desired product. For example,
commercial spray dryers manufactured by Buchi Ltd. or Niro Corp. can effectively produce
particles of desired size.

Spray-dried powdered particles can be approximately spherical in shape, nearly
uniform in size and frequently hollow. There may be some degree of irregularity in shape
depending upon the incorporated medicament and the spray drying conditions. In many
instances the dispersion stability of spray-dried microspheres appears to be more effective if
an inflating agent (or blowing agent) is used in their production. Particularly preferred
embodiments may comprise an emulsion with an inflating agent as the disperse or continuous
phase (the other phase being aqueous in nature). An inflating agent is preferably dispersed
with a surfactant solution, using, for instance, a commercially available microfluidizer at a
pressure of about 5000 to 15,000 psi. This process forms an emulsion, preferably stabilized
by an incorporated surfactant, typically comprising submicron droplets of water immiscible
blowing agent dispersed in an aqueous continuous phase. The formation of such dispersions
using this and other techniques is common and well known to those in the art. The blowing
agent is preferably a fluorinated compound (e.g., perfluorohexane, perfluorooctyl bromide,
perfluorodecalin, perfluorobutyl ethane) which vaporizes during the spray-drying process,
leaving behind generally hollow, porous, aerodynamically light microspheres. As will be
discussed in more detail below, other suitable blowing agents include chloroform, freons, and
hydrocarbons. Nitrogen gas and carbon dioxide are also contemplated as suitable blowing
agents.

Although the perforated microstructures are preferably formed using a blowing agent
as described above, it will be appreciated that, in some instances, no blowing agent is
required and an aqueous dispersion of the medicament and surfactant(s) is spray dried
directly. In such cases, the formulation may be amenable to process conditions (e.g., elevated
temperatures) that generally lead to the formation of hollow, relatively porous microparticles.
Moreover, the medicament may possess special physicochemical properties (e.g., high
crystallinity, elevated melting temperature, surface activity, etc.) that make it particularly
suitable for use in such techniques.

The perforated microstructures may optionally be associated with, or comprise, one
or more surfactants. Moreover, miscible surfactants may optionally be combined with the
suspension medium liquid phase. It will be appreciated by those skilled in the art that the use
of surfactants may further increase dispersion stability, simplify formulation procedures or
increase bioavailability upon administration. Of course combinations of surfactants,
including the use of one or more in the liquid phase and one or more associated with the
perforated microstructures, are contemplated as being within the scope of the invention. By
"associated with or comprise" it is meant that the structural matrix or perforated
microstructure may incorporate, adsorb, absorb, be coated with or be formed by the
surfactant.

Surfactants suitable for use include any compound or composition that aids in the
formation and maintenance of the stabilized respiratory dispersions by forming a layer at the
interface between the structural matrix and the suspension medium. The surfactant may
comprise a single compound or any combination of compounds, such as in the case of
co-surfactants. Particularly preferred surfactants are substantially insoluble in the propellant,
nonfluorinated, and selected from the group consisting of saturated and unsaturated lipids,
nonionic detergents, nonionic block copolymers, ionic surfactants, and combinations of such
agents. It should be emphasized that, in addition to the aforementioned surfactants, suitable
(i.e., biocompatible) fluorinated surfactants are compatible with the teachings herein and may
be used to provide the desired stabilized preparations.

Lipids, including phospholipids, from both natural and synthetic sources may be used
in varying concentrations to form a structural matrix. Generally, compatible lipids comprise
those that have a gel to liquid crystal phase transition greater than about 40° C. Preferably,
the incorporated lipids are relatively long chain (i.e., C6-C22) saturated lipids and more
preferably comprise phospholipids. Exemplary phospholipids useful in the disclosed
stabilized preparations comprise egg phosphatidylcholine, dilauroylphosphatidylcholine,
dioleylphosphatidylcholine, dipalmitoylphosphatidylcholine, distearoylphosphatidylcholine,
short-chain phosphatidylcholines, phosphatidylethanolamine,
dioleylphosphatidylethanolamine, phosphatidylserine, phosphatidylglycerol,
phosphatidylinositol, glycolipids, ganglioside GM1, sphingomyelin, phosphatidic acid,
cardiolipin; lipids bearing polymer chains such as polyethylene glycol, chitin, hyaluronic
acid, or polyvinylpyrrolidone; lipids bearing sulfonated mono-, di-, and polysaccharides;
fatty acids such as palmitic acid, stearic acid, and oleic acid; cholesterol, cholesterol esters,
and cholesterol hemisuccinate. Due to their excellent biocompatibility characteristics,
phospholipids and combinations of phospholipids and poloxamers are particularly suitable
for use in the stabilized dispersions disclosed herein.

Compatible nonionic detergents comprise: sorbitan esters including sorbitan trioleate
(Span™ 85), sorbitan sesquioleate, sorbitan monooleate, sorbitan monolaurate,
polyoxyethylene (20) sorbitan monolaurate, and polyoxyethylene (20) sorbitan monooleate,
oleyl polyoxyethylene (2) ether, stearyl polyoxyethylene (2) ether, lauryl polyoxyethylene (4)
ether, glycerol esters, and sucrose esters. Other suitable nonionic detergents can be easily
identified using McCutcheon's Emulsifiers and Detergents (McPublishing Co., Glen Rock,
N. J.). Preferred block copolymers include diblock and triblock copolymers of
polyoxyethylene and polyoxypropylene, including poloxamer 188 (Pluronic® F68),
poloxamer 407 (Pluronic® F-127), and poloxamer 338. Ionic surfactants such as sodium
sulfosuccinate and fatty acid soaps may also be utilized. In preferred embodiments, the
microstructures may comprise oleic acid or its alkali salt.

In addition to the aforementioned surfactants, cationic surfactants or lipids are
preferred especially in the case of delivery of an iRNA agent, e.g., a double-stranded iRNA
agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed
into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA
agent, or sRNA agent, or precursor thereof). Examples of suitable cationic lipids include:
DOTMA, N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium chloride; DOTAP,
1,2-dioleyloxy-3-(trimethylammonio)propane; and DOTB,
1,2-dioleyl-3-(4'-trimethylammonio)butanoyl-sn-glycerol. Polycationic amino acids such as
polylysine and polyarginine are also contemplated.

For the spraying process, such spraying methods as rotary atomization, pressure
atomization and two-fluid atomization can be used. Examples of the devices used in these
processes include "Parubisu [phonetic rendering] Mini-Spray GA-32" and "Parubisu Spray
Drier DL-41," manufactured by Yamato Chemical Co. "Spray Drier CL-8," "Spray Drier
L-8," "Spray Drier FL-12," "Spray Drier FL-16" or "Spray Drier FL-20," manufactured by
Okawara Kakoki Co., can be used for the method of spraying using a rotary-disk atomizer.

While no particular restrictions are placed on the gas used to dry the sprayed material,
it is recommended to use air, nitrogen gas or an inert gas. The inlet temperature of the
gas used to dry the sprayed material is such that it does not cause heat deactivation of the
sprayed material. The range of temperatures may vary between about 50°C and about 200°C,
preferably between about 50°C and 100°C. The temperature of the outlet gas used to dry the
sprayed material may vary between about 0°C and about 150°C, preferably between 0°C and
90°C, and even more preferably between 0°C and 60°C.
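
The inlet and outlet gas temperature ranges stated above can be captured in a small check. This is a minimal sketch under stated assumptions: the names `INLET_TIERS`, `OUTLET_TIERS` and `classify_temperature` are hypothetical, and the "about" in the text is treated here as a hard, inclusive bound, which is a simplification.

```python
# Tiers are listed from narrowest to broadest: (label, low °C, high °C).
# Bounds are taken from the text; treating them as inclusive hard limits
# is an assumption made only for this illustration.
INLET_TIERS = [
    ("preferred", 50.0, 100.0),
    ("stated range", 50.0, 200.0),
]
OUTLET_TIERS = [
    ("more preferred", 0.0, 60.0),
    ("preferred", 0.0, 90.0),
    ("stated range", 0.0, 150.0),
]


def classify_temperature(temp_c: float, tiers) -> str:
    """Return the label of the narrowest tier that contains temp_c."""
    for label, low, high in tiers:
        if low <= temp_c <= high:
            return label
    return "outside stated range"


inlet_label = classify_temperature(80.0, INLET_TIERS)
outlet_label = classify_temperature(55.0, OUTLET_TIERS)
```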

The spray drying is done under conditions that result in substantially amorphous
powder of homogeneous constitution having a particle size that is respirable, a low moisture
content and flow characteristics that allow for ready aerosolization. Preferably the particle
size of the resulting powder is such that more than about 98% of the mass is in particles
having a diameter of about 10 µm or less with about 90% of the mass being in particles
having a diameter less than 5 µm. Alternatively, about 95% of the mass will have particles
with a diameter of less than 10 µm with about 80% of the mass of the particles having a
diameter of less than 5 µm.
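
The two particle size criteria above can be expressed as cumulative mass fractions. The sketch below is illustrative only: the function names, the binned example data, and the reading of "about" as a hard cutoff are assumptions, not part of the disclosure.

```python
from typing import Optional, Sequence


def mass_fraction(diameters_um: Sequence[float], masses: Sequence[float],
                  cutoff_um: float, inclusive: bool) -> float:
    """Fraction of total mass in particles below (or at/below) a cutoff."""
    total = sum(masses)
    if total <= 0:
        raise ValueError("total mass must be positive")
    selected = sum(
        m for d, m in zip(diameters_um, masses)
        if d < cutoff_um or (inclusive and d == cutoff_um)
    )
    return selected / total


def respirable_spec(diameters_um: Sequence[float],
                    masses: Sequence[float]) -> Optional[str]:
    """Report which of the two stated size criteria a powder meets, if any."""
    at_most_10 = mass_fraction(diameters_um, masses, 10.0, inclusive=True)
    below_10 = mass_fraction(diameters_um, masses, 10.0, inclusive=False)
    below_5 = mass_fraction(diameters_um, masses, 5.0, inclusive=False)
    # Preferred: more than 98% of mass at 10 µm or less, 90% below 5 µm.
    if at_most_10 > 0.98 and below_5 >= 0.90:
        return "preferred"
    # Alternative: 95% of mass below 10 µm, 80% below 5 µm.
    if below_10 >= 0.95 and below_5 >= 0.80:
        return "alternative"
    return None


# Hypothetical binned distribution: bin diameters (µm) and mass per bin.
bins_um = [1, 3, 4, 8, 12]
result = respirable_spec(bins_um, [30, 40, 21, 8, 1])
```

Each bin is represented by a single diameter, so a real size distribution would need to be binned finely enough around the 5 µm and 10 µm cutoffs for the check to be meaningful.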

The dispersible pharmaceutical-based dry powders that include the iRNA preparation
may optionally be combined with pharmaceutical carriers or excipients which are suitable for
respiratory and pulmonary administration. Such carriers may serve simply as bulking agents
when it is desired to reduce the iRNA concentration in the powder which is being delivered
to a patient, but may also serve to enhance the stability of the iRNA compositions and to
improve the dispersibility of the powder within a powder dispersion device in order to
provide more efficient and reproducible delivery of the iRNA and to improve handling
characteristics of the iRNA such as flowability and consistency to facilitate manufacturing
and powder filling.

Such carrier materials may be combined with the drug prior to spray drying, i.e., by
adding the carrier material to the purified bulk solution. In that way, the carrier particles will
be formed simultaneously with the drug particles to produce a homogeneous powder.
Alternatively, the carriers may be separately prepared in a dry powder form and combined
with the dry powder drug by blending. The powder carriers will usually be crystalline (to
avoid water absorption), but might in some cases be amorphous or mixtures of crystalline
and amorphous. The size of the carrier particles may be selected to improve the flowability of
the drug powder, typically being in the range from 25 µm to 100 µm. A preferred carrier
material is crystalline lactose having a size in the above-stated range.

Powders prepared by any of the above methods will be collected from the spray dryer
in a conventional manner for subsequent use. For use as pharmaceuticals and other purposes,
it will frequently be desirable to disrupt any agglomerates which may have formed by
screening or other conventional techniques. For pharmaceutical uses, the dry powder
formulations will usually be measured into a single dose, and the single dose sealed into a
package. Such packages are particularly useful for dispersion in dry powder inhalers, as
described in detail below. Alternatively, the powders may be packaged in multiple-dose
containers.

Methods for spray drying hydrophobic and other drugs and components are described
in U.S. Pat. Nos. 5,000,888; 5,026,550; 4,670,419; 4,540,602; and 4,486,435. Bloch and
Speiser (1983) Pharm. Acta Helv 58:14-22 teaches spray drying of hydrochlorothiazide and
chlorthalidone (lipophilic drugs) and a hydrophilic adjuvant (pentaerythritol) in azeotropic
solvents of dioxane-water and 2-ethoxyethanol-water. A number of Japanese Patent
application Abstracts relate to spray drying of hydrophilic-hydrophobic product
combinations, including JP 806766; JP 7242568; JP 7101884; JP 7101883; JP 71018982; JP
7101881; and JP 4036233. Other foreign patent publications relevant to spray drying
hydrophilic-hydrophobic product combinations include FR 2594693; DE 2209477; and
WO 88/07870.

LYOPHILIZATION

An iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) preparation can be made by lyophilization. Lyophilization is a
freeze-drying process in which water is sublimed from the composition after it is frozen. The
particular advantage associated with the lyophilization process is that biologicals and
pharmaceuticals that are relatively unstable in an aqueous solution can be dried without
elevated temperatures (thereby eliminating the adverse thermal effects), and then stored in a
dry state where there are few stability problems. With respect to the instant invention such
techniques are particularly compatible with the incorporation of nucleic acids in perforated
microstructures without compromising physiological activity. Methods for providing
lyophilized particulates are known to those of skill in the art and it would clearly not require
undue experimentation to provide dispersion compatible microstructures in accordance with
the teachings herein. Accordingly, to the extent that lyophilization processes may be used to
provide microstructures having the desired porosity and size, they are in conformance with the
teachings herein and are expressly contemplated as being within the scope of the instant
invention.

Targeting

For ease of exposition the formulations, compositions and methods in this section are
discussed largely with regard to unmodified iRNAs. It should be understood, however, that
these formulations, compositions and methods can be practiced with other iRNA agents, e.g.,
modified iRNA agents, and such practice is within the invention.

In some embodiments, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA
agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA
agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or
sRNA agent, or precursor thereof) is targeted to a particular cell. For example, a liposome
or particle or other structure that includes an iRNA can also include a targeting moiety that
recognizes a specific molecule on a target cell. The targeting moiety can be a molecule with
a specific affinity for a target cell. Targeting moieties can include antibodies directed against
a protein found on the surface of a target cell, or the ligand or a receptor-binding portion of a
ligand for a molecule found on the surface of a target cell. For example, the targeting moiety
can recognize a cancer-specific antigen (e.g., CA15-3, CA19-9, CEA, or HER2/neu) or a
viral antigen, thus delivering the iRNA to a cancer cell or a virus-infected cell. Exemplary
targeting moieties include antibodies (such as IgM, IgG, IgA, IgD, and the like, or
functional portions thereof) and ligands for cell surface receptors (e.g., ectodomains thereof).

Table 3 provides a number of antigens which can be used to target selected cells.



Table 3.

ANTIGEN                                   Exemplary tumor tissue

CEA (carcinoembryonic antigen)            colon, breast, lung
PSA (prostate specific antigen)           prostate cancer
CA-125                                    ovarian cancer
CA15-3                                    breast cancer
CA 19-9                                   breast cancer
HER2/neu                                  breast cancer
α-fetoprotein                             testicular cancer, hepatic cancer
β-HCG (human chorionic gonadotropin)      testicular cancer, choriocarcinoma
MUC-1                                     breast cancer
Estrogen receptor                         breast cancer, uterine cancer
Progesterone receptor                     breast cancer, uterine cancer
EGFr (epidermal growth factor receptor)   bladder cancer



In one embodiment, the targeting moiety is attached to a liposome. For example, US
6,245,427 describes a method for targeting a liposome using a protein or peptide. In another
example, a cationic lipid component of the liposome is derivatized with a targeting moiety.
For example, WO 96/37194 describes converting
N-glutaryldioleoylphosphatidylethanolamine to an N-hydroxysuccinimide activated ester.
The product was then coupled to an RGD peptide.

GENES AND DISEASES 

In one aspect, the invention features a method of treating a subject at risk for or
afflicted with unwanted cell proliferation, e.g., malignant or nonmalignant cell proliferation.
The method includes:

providing an iRNA agent, e.g., an sRNA or iRNA agent described herein, e.g., an
iRNA having a structure described herein, where the iRNA is homologous to and can silence,
e.g., by cleavage, a gene which promotes unwanted cell proliferation;

administering an iRNA agent, e.g., an sRNA or iRNA agent described herein to a
subject, preferably a human subject,

thereby treating the subject.

In a preferred embodiment the gene is a growth factor or growth factor receptor gene,
a kinase, e.g., a protein tyrosine, serine or threonine kinase gene, an adaptor protein gene, a
gene encoding a G protein superfamily molecule, or a gene encoding a transcription factor.

In a preferred embodiment the iRNA agent silences the PDGF beta gene, and thus can
be used to treat a subject having or at risk for a disorder characterized by unwanted PDGF
beta expression, e.g., testicular and lung cancers.

In another preferred embodiment the iRNA agent silences the Erb-B gene, and thus
can be used to treat a subject having or at risk for a disorder characterized by unwanted
Erb-B expression, e.g., breast cancer.

In a preferred embodiment the iRNA agent silences the Src gene, and thus can be
used to treat a subject having or at risk for a disorder characterized by unwanted Src
expression, e.g., colon cancers.

In a preferred embodiment the iRNA agent silences the CRK gene, and thus can be
used to treat a subject having or at risk for a disorder characterized by unwanted CRK
expression, e.g., colon and lung cancers.

In a preferred embodiment the iRNA agent silences the GRB2 gene, and thus can be
used to treat a subject having or at risk for a disorder characterized by unwanted GRB2
expression, e.g., squamous cell carcinoma.

In another preferred embodiment the iRNA agent silences the RAS gene, and thus can
be used to treat a subject having or at risk for a disorder characterized by unwanted RAS
expression, e.g., pancreatic, colon and lung cancers, and chronic leukemia.

In another preferred embodiment the iRNA agent silences the MEKK gene, and thus
can be used to treat a subject having or at risk for a disorder characterized by unwanted
MEKK expression, e.g., squamous cell carcinoma, melanoma or leukemia.

In another preferred embodiment the iRNA agent silences the JNK gene, and thus can
be used to treat a subject having or at risk for a disorder characterized by unwanted JNK
expression, e.g., pancreatic or breast cancers.

In a preferred embodiment the iRNA agent silences the RAF gene, and thus can be
used to treat a subject having or at risk for a disorder characterized by unwanted RAF
expression, e.g., lung cancer or leukemia.

In a preferred embodiment the iRNA agent silences the Erk1/2 gene, and thus can be
used to treat a subject having or at risk for a disorder characterized by unwanted Erk1/2
expression, e.g., lung cancer.

In another preferred embodiment the iRNA agent silences the PCNA(p21) gene, and
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted
PCNA expression, e.g., lung cancer.

In a preferred embodiment the iRNA agent silences the MYB gene, and thus can be
used to treat a subject having or at risk for a disorder characterized by unwanted MYB
expression, e.g., colon cancer or chronic myelogenous leukemia.

In a preferred embodiment the iRNA agent silences the c-MYC gene, and thus can be
used to treat a subject having or at risk for a disorder characterized by unwanted c-MYC
expression, e.g., Burkitt's lymphoma or neuroblastoma.

In another preferred embodiment the iRNA agent silences the JUN gene, and thus can
be used to treat a subject having or at risk for a disorder characterized by unwanted JUN
expression, e.g., ovarian, prostate or breast cancers.

In another preferred embodiment the iRNA agent silences the FOS gene, and thus can 
be used to treat a subject having or at risk for a disorder characterized by unwanted FOS 
expression, e.g., skin or prostate cancers. 

In a preferred embodiment the iRNA agent silences the BCL-2 gene, and thus can be
used to treat a subject having or at risk for a disorder characterized by unwanted BCL-2
expression, e.g., lung or prostate cancers or Non-Hodgkin lymphoma.

In a preferred embodiment the iRNA agent silences the Cyclin D gene, and thus can
be used to treat a subject having or at risk for a disorder characterized by unwanted Cyclin D
expression, e.g., esophageal and colon cancers.

In a preferred embodiment the iRNA agent silences the VEGF gene, and thus can be
used to treat a subject having or at risk for a disorder characterized by unwanted VEGF
expression, e.g., esophageal and colon cancers.

In a preferred embodiment the iRNA agent silences the EGFR gene, and thus can be
used to treat a subject having or at risk for a disorder characterized by unwanted EGFR
expression, e.g., breast cancer.

In another preferred embodiment the iRNA agent silences the Cyclin A gene, and
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted
Cyclin A expression, e.g., lung and cervical cancers.

In another preferred embodiment the iRNA agent silences the Cyclin E gene, and thus
can be used to treat a subject having or at risk for a disorder characterized by unwanted
Cyclin E expression, e.g., lung and breast cancers.

In another preferred embodiment the iRNA agent silences the WNT-1 gene, and thus
can be used to treat a subject having or at risk for a disorder characterized by unwanted
WNT-1 expression, e.g., basal cell carcinoma.

In another preferred embodiment the iRNA agent silences the beta-catenin gene, and
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted
beta-catenin expression, e.g., adenocarcinoma or hepatocellular carcinoma.

In another preferred embodiment the iRNA agent silences the c-MET gene, and thus
can be used to treat a subject having or at risk for a disorder characterized by unwanted
c-MET expression, e.g., hepatocellular carcinoma.

In another preferred embodiment the iRNA agent silences the PKC gene, and thus can 
be used to treat a subject having or at risk for a disorder characterized by unwanted PKC 
expression, e.g., breast cancer. 

In a preferred embodiment the iRNA agent silences the NFKB gene, and thus can be
used to treat a subject having or at risk for a disorder characterized by unwanted NFKB
expression, e.g., breast cancer.

In a preferred embodiment the iRNA agent silences the STAT3 gene, and thus can be
used to treat a subject having or at risk for a disorder characterized by unwanted STAT3
expression, e.g., prostate cancer.

In another preferred embodiment the iRNA agent silences the survivin gene, and thus
can be used to treat a subject having or at risk for a disorder characterized by unwanted
survivin expression, e.g., cervical or pancreatic cancers.

In another preferred embodiment the iRNA agent silences the Her2/Neu gene, and
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted
Her2/Neu expression, e.g., breast cancer.

In another preferred embodiment the iRNA agent silences the topoisomerase I gene,
and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted topoisomerase I expression, e.g., ovarian and colon cancers.

In a preferred embodiment the iRNA agent silences the topoisomerase II alpha gene,
and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted topoisomerase II expression, e.g., breast and colon cancers.

In a preferred embodiment the iRNA agent silences mutations in the p73 gene, and
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted
p73 expression, e.g., colorectal adenocarcinoma.

In a preferred embodiment the iRNA agent silences mutations in the
p21(WAF1/CIP1) gene, and thus can be used to treat a subject having or at risk for a disorder
characterized by unwanted p21(WAF1/CIP1) expression, e.g., liver cancer.

In a preferred embodiment the iRNA agent silences mutations in the p27(KIP1) gene,
and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted p27(KIP1) expression, e.g., liver cancer.

In a preferred embodiment the iRNA agent silences mutations in the PPM1D gene,
and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted PPM1D expression, e.g., breast cancer.

In a preferred embodiment the iRNA agent silences mutations in the RAS gene, and
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted
RAS expression, e.g., breast cancer.

In another preferred embodiment the iRNA agent silences mutations in the caveolin I
gene, and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted caveolin I expression, e.g., esophageal squamous cell carcinoma.

In another preferred embodiment the iRNA agent silences mutations in the MIB I
gene, and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted MIB I expression, e.g., male breast carcinoma (MBC).

In another preferred embodiment the iRNA agent silences mutations in the MTAI
gene, and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted MTAI expression, e.g., ovarian carcinoma.

In another preferred embodiment the iRNA agent silences mutations in the M68 gene,
and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted M68 expression, e.g., human adenocarcinomas of the esophagus, stomach, colon,
and rectum.

In preferred embodiments the iRNA agent silences mutations in tumor suppressor
genes, and thus can be used as a method to promote apoptotic activity in combination with
chemotherapeutics.

In a preferred embodiment the iRNA agent silences mutations in the p53 tumor
suppressor gene, and thus can be used to treat a subject having or at risk for a disorder
characterized by unwanted p53 expression, e.g., gall bladder, pancreatic and lung cancers.

In a preferred embodiment the iRNA agent silences mutations in the p53 family
member DN-p63, and thus can be used to treat a subject having or at risk for a disorder
characterized by unwanted DN-p63 expression, e.g., squamous cell carcinoma.

In a preferred embodiment the iRNA agent silences mutations in the pRb tumor
suppressor gene, and thus can be used to treat a subject having or at risk for a disorder
characterized by unwanted pRb expression, e.g., oral squamous cell carcinoma.

In a preferred embodiment the iRNA agent silences mutations in the APC1 tumor
suppressor gene, and thus can be used to treat a subject having or at risk for a disorder
characterized by unwanted APC1 expression, e.g., colon cancer.

In a preferred embodiment the iRNA agent silences mutations in the BRCA1 tumor
suppressor gene, and thus can be used to treat a subject having or at risk for a disorder
characterized by unwanted BRCA1 expression, e.g., breast cancer.

In a preferred embodiment the iRNA agent silences mutations in the PTEN tumor
suppressor gene, and thus can be used to treat a subject having or at risk for a disorder
characterized by unwanted PTEN expression, e.g., hamartomas, gliomas, and prostate and
endometrial cancers.

In a preferred embodiment the iRNA agent silences MLL fusion genes, e.g.,
MLL-AF9, and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted MLL fusion gene expression, e.g., acute leukemias.

In another preferred embodiment the iRNA agent silences the BCR/ABL fusion gene,
and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted BCR/ABL fusion gene expression, e.g., acute and chronic leukemias.

In another preferred embodiment the iRNA agent silences the TEL/AML1 fusion
gene, and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted TEL/AML1 fusion gene expression, e.g., childhood acute leukemia.

In another preferred embodiment the iRNA agent silences the EWS/FLI1 fusion gene,
and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted EWS/FLI1 fusion gene expression, e.g., Ewing Sarcoma.

In another preferred embodiment the iRNA agent silences the TLS/FUS1 fusion gene,
and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted TLS/FUS1 fusion gene expression, e.g., myxoid liposarcoma.

In another preferred embodiment the iRNA agent silences the PAX3/FKHR fusion
gene, and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted PAX3/FKHR fusion gene expression, e.g., myxoid liposarcoma.

In another preferred embodiment the iRNA agent silences the AML1/ETO fusion
gene, and thus can be used to treat a subject having or at risk for a disorder characterized by
unwanted AML1/ETO fusion gene expression, e.g., acute leukemia.

In another aspect, the invention features a method of treating a subject, e.g., a human,
at risk for or afflicted with a disease or disorder that may benefit by angiogenesis inhibition,
e.g., cancer. The method includes:

providing an iRNA agent, e.g., an iRNA agent having a structure described herein,
which iRNA agent is homologous to and can silence, e.g., by cleavage, a gene which
mediates angiogenesis;

administering the iRNA agent to a subject,

thereby treating the subject.

In a preferred embodiment the iRNA agent silences the alpha v-integrin gene, and
thus can be used to treat a subject having or at risk for a disorder characterized by unwanted
alpha v-integrin, e.g., brain tumors or tumors of epithelial origin.

In a preferred embodiment the iRNA agent silences the Flt-1 receptor gene, and thus
can be used to treat a subject having or at risk for a disorder characterized by unwanted Flt-1
receptors, e.g., cancer and rheumatoid arthritis.

In a preferred embodiment the iRNA agent silences the tubulin gene, and thus can be
used to treat a subject having or at risk for a disorder characterized by unwanted tubulin,
e.g., cancer and retinal neovascularization.

In another aspect, the invention features a method of treating a subject infected with a
virus or at risk for or afflicted with a disorder or disease associated with a viral infection.
The method includes:

providing an iRNA agent, e.g., an iRNA agent having a structure described herein,
which iRNA agent is homologous to and can silence, e.g., by cleavage, a viral gene or a
cellular gene which mediates viral function, e.g., entry or growth;

administering the iRNA agent to a subject, preferably a human subject,

thereby treating the subject.

Thus, the invention provides for a method of treating patients infected by the Human
Papilloma Virus (HPV) or at risk for or afflicted with a disorder mediated by HPV, e.g.,
cervical cancer. HPV is linked to 95% of cervical carcinomas and thus an antiviral therapy is
an attractive method to treat these cancers and other symptoms of viral infection.

In a preferred embodiment, the expression of an HPV gene is reduced. In another
preferred embodiment, the HPV gene is one of the group of E2, E6, or E7.

In a preferred embodiment the expression of a human gene that is required for HPV
replication is reduced.

The invention also includes a method of treating patients infected by the Human
Immunodeficiency Virus (HIV) or at risk for or afflicted with a disorder mediated by HIV,
e.g., Acquired Immune Deficiency Syndrome (AIDS).

In a preferred embodiment, the expression of an HIV gene is reduced. In another
preferred embodiment, the HIV gene is CCR5, Gag, or Rev.

In a preferred embodiment the expression of a human gene that is required for HIV
replication is reduced. In another preferred embodiment, the gene is CD4 or Tsg101.

The invention also includes a method for treating patients infected by the Hepatitis B
Virus (HBV) or at risk for or afflicted with a disorder mediated by HBV, e.g., cirrhosis and
hepatocellular carcinoma.

In a preferred embodiment, the expression of an HBV gene is reduced. In another
preferred embodiment, the targeted HBV gene encodes one of the group of the tail region of
the HBV core protein, the pre-cregious (pre-c) region, or the cregious (c) region. In another
preferred embodiment, a targeted HBV-RNA sequence is comprised of the poly(A) tail.

In a preferred embodiment the expression of a human gene that is required for HBV
replication is reduced.

The invention also provides for a method of treating patients infected by the Hepatitis
A Virus (HAV), or at risk for or afflicted with a disorder mediated by HAV.

In a preferred embodiment the expression of a human gene that is required for HAV
replication is reduced.

The present invention provides for a method of treating patients infected by the
Hepatitis C Virus (HCV), or at risk for or afflicted with a disorder mediated by HCV, e.g.,
cirrhosis.

In a preferred embodiment, the expression of an HCV gene is reduced.

In another preferred embodiment the expression of a human gene that is required for
HCV replication is reduced.

The present invention also provides for a method of treating patients infected by
any of the group of Hepatitis Viral strains comprising hepatitis D, E, F, G, or H, or patients at
risk for or afflicted with a disorder mediated by any of these strains of hepatitis.

In a preferred embodiment, the expression of a Hepatitis D, E, F, G, or H gene is
reduced.

In another preferred embodiment the expression of a human gene that is required for
hepatitis D, E, F, G or H replication is reduced.

Methods of the invention also provide for treating patients infected by the
Respiratory Syncytial Virus (RSV) or at risk for or afflicted with a disorder mediated by
RSV, e.g., lower respiratory tract infection in infants and childhood asthma, pneumonia and
other complications, e.g., in the elderly.

In a preferred embodiment, the expression of an RSV gene is reduced. In another
preferred embodiment, the targeted RSV gene is one of the group of genes N, L, or P.

In a preferred embodiment the expression of a human gene that is required for RSV
replication is reduced.

Methods of the invention provide for treating patients infected by the Herpes
Simplex Virus (HSV) or at risk for or afflicted with a disorder mediated by HSV, e.g., genital
herpes and cold sores as well as life-threatening or sight-impairing disease mainly in
immunocompromised patients.

In a preferred embodiment, the expression of an HSV gene is reduced. In another
preferred embodiment, the targeted HSV gene encodes DNA polymerase or the
helicase-primase.

In a preferred embodiment the expression of a human gene that is required for HSV
replication is reduced.

The invention also provides a method for treating patients infected by the herpes
Cytomegalovirus (CMV) or at risk for or afflicted with a disorder mediated by CMV, e.g.,
congenital virus infections and morbidity in immunocompromised patients.

In a preferred embodiment, the expression of a CMV gene is reduced.

In a preferred embodiment the expression of a human gene that is required for CMV
replication is reduced.

Methods of the invention also provide for treating patients infected by
the herpes Epstein Barr Virus (EBV) or at risk for or afflicted with a disorder mediated by
EBV, e.g., NK/T-cell lymphoma, non-Hodgkin lymphoma, and Hodgkin disease.

In a preferred embodiment, the expression of an EBV gene is reduced.

In a preferred embodiment the expression of a human gene that is required for EBV
replication is reduced.

Methods of the invention also provide for treating patients infected by Kaposi's
Sarcoma-associated Herpes Virus (KSHV), also called human herpesvirus 8, or patients at
risk for or afflicted with a disorder mediated by KSHV, e.g., Kaposi's sarcoma, multicentric
Castleman's disease and AIDS-associated primary effusion lymphoma.

In a preferred embodiment, the expression of a KSHV gene is reduced.

In a preferred embodiment the expression of a human gene that is required for KSHV
replication is reduced.

The invention also includes a method for treating patients infected by the JC Virus
(JCV) or a disease or disorder associated with this virus, e.g., progressive multifocal
leukoencephalopathy (PML).

In a preferred embodiment, the expression of a JCV gene is reduced.

In a preferred embodiment the expression of a human gene that is required for JCV
replication is reduced.

Methods of the invention also provide for treating patients infected by the myxovirus
or at risk for or afflicted with a disorder mediated by myxovirus, e.g., influenza.

In a preferred embodiment, the expression of a myxovirus gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for 
myxovirus replication is reduced. 

Methods of the invention also provide for treating patients infected by the rhinovirus
or at risk for or afflicted with a disorder mediated by rhinovirus, e.g., the common cold.

In a preferred embodiment, the expression of a rhinovirus gene is reduced.

In a preferred embodiment the expression of a human gene that is required for
rhinovirus replication is reduced.

Methods of the invention also provide for treating patients infected by the coronavirus
or at risk for or afflicted with a disorder mediated by coronavirus, e.g., the common cold.

In a preferred embodiment, the expression of a coronavirus gene is reduced.

In a preferred embodiment the expression of a human gene that is required for
coronavirus replication is reduced.

Methods of the invention also provide for treating patients infected by the flavivirus 
West Nile or at risk for or afflicted with a disorder mediated by West Nile Virus. 

In a preferred embodiment, the expression of a West Nile Virus gene is reduced. In
another preferred embodiment, the West Nile Virus gene is one of the group comprising E,
NS3, or NS5.

In a preferred embodiment the expression of a human gene that is required for West
Nile Virus replication is reduced.

Methods of the invention also provide for treating patients infected by the St. Louis
Encephalitis flavivirus, or at risk for or afflicted with a disease or disorder associated with
this virus, e.g., viral haemorrhagic fever or neurological disease.

In a preferred embodiment, the expression of a St. Louis Encephalitis gene is reduced.

In a preferred embodiment the expression of a human gene that is required for St.
Louis Encephalitis virus replication is reduced.

Methods of the invention also provide for treating patients infected by the Tick-borne
encephalitis flavivirus, or at risk for or afflicted with a disorder mediated by Tick-borne
encephalitis virus, e.g., viral haemorrhagic fever and neurological disease.

In a preferred embodiment, the expression of a Tick-borne encephalitis virus gene is
reduced.

In a preferred embodiment the expression of a human gene that is required for
Tick-borne encephalitis virus replication is reduced.

Methods of the invention also provide for treating patients infected by the
Murray Valley encephalitis flavivirus, which commonly results in viral haemorrhagic fever
and neurological disease.

In a preferred embodiment, the expression of a Murray Valley encephalitis virus gene
is reduced.

In a preferred embodiment the expression of a human gene that is required for Murray
Valley encephalitis virus replication is reduced.

The invention also includes methods for treating patients infected by the dengue
flavivirus, or afflicted with a disease or disorder associated with this virus, e.g., dengue
haemorrhagic fever.

In a preferred embodiment, the expression of a dengue virus gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for dengue
virus replication is reduced.

Methods of the invention also provide for treating patients infected by the Simian 
Virus 40 (SV40) or at risk for or afflicted with a disorder mediated by SV40, e.g.,
tumorigenesis. 

In a preferred embodiment, the expression of an SV40 gene is reduced.

In a preferred embodiment the expression of a human gene that is required for SV40
replication is reduced. 

The invention also includes methods for treating patients infected by the Human T
Cell Lymphotropic Virus (HTLV), or afflicted with a disease or disorder associated with
this virus, e.g., leukemia and myelopathy.

In a preferred embodiment, the expression of an HTLV gene is reduced. In another
preferred embodiment the HTLV1 gene is the Tax transcriptional activator.

In a preferred embodiment the expression of a human gene that is required for HTLV 
replication is reduced. 

Methods of the invention also provide for treating patients infected by the
Moloney-Murine Leukemia Virus (Mo-MuLV) or at risk for or afflicted with a disorder
mediated by Mo-MuLV, e.g., T-cell leukemia.

In a preferred embodiment, the expression of a Mo-MuLV gene is reduced.

In a preferred embodiment the expression of a human gene that is required for
Mo-MuLV replication is reduced.

Methods of the invention also provide for treating patients infected by the 
encephalomyocarditis virus (EMCV) or at risk for or afflicted with a disorder mediated by 
EMCV, e.g., myocarditis. EMCV leads to myocarditis in mice and pigs and is capable of
infecting human myocardial cells. This virus is therefore a concern for patients undergoing
xenotransplantation.

In a preferred embodiment, the expression of an EMCV gene is reduced.

In a preferred embodiment the expression of a human gene that is required for EMCV
replication is reduced. 

The invention also includes a method for treating patients infected by the measles
virus (MV) or at risk for or afflicted with a disorder mediated by MV, e.g., measles.

In a preferred embodiment, the expression of an MV gene is reduced.

In a preferred embodiment the expression of a human gene that is required for MV
replication is reduced. 

The invention also includes a method for treating patients infected by the Varicella
zoster virus (VZV) or at risk for or afflicted with a disorder mediated by VZV, e.g.,
chickenpox or shingles (also called zoster).

In a preferred embodiment, the expression of a VZV gene is reduced.

In a preferred embodiment the expression of a human gene that is required for VZV
replication is reduced. 

The invention also includes a method for treating patients infected by an adenovirus 
or at risk for or afflicted with a disorder mediated by an adenovirus, e.g. respiratory tract 
infection. 

In a preferred embodiment, the expression of an adenovirus gene is reduced.

In a preferred embodiment the expression of a human gene that is required for
adenovirus replication is reduced. 

The invention includes a method for treating patients infected by a yellow fever virus
(YFV) or at risk for or afflicted with a disorder mediated by a YFV, e.g. respiratory tract 
infection. 

In a preferred embodiment, the expression of a YFV gene is reduced. In another 
preferred embodiment, the preferred gene is one of a group that includes the E, NS2A, or 
NS3 genes. 

In a preferred embodiment the expression of a human gene that is required for YFV 
replication is reduced. 

Methods of the invention also provide for treating patients infected by the poliovirus
or at risk for or afflicted with a disorder mediated by poliovirus, e.g., polio. 

In a preferred embodiment, the expression of a poliovirus gene is reduced.

In a preferred embodiment the expression of a human gene that is required for 
poliovirus replication is reduced. 

Methods of the invention also provide for treating patients infected by a poxvirus or
at risk for or afflicted with a disorder mediated by a poxvirus, e.g., smallpox.

In a preferred embodiment, the expression of a poxvirus gene is reduced. 

In a preferred embodiment the expression of a human gene that is required for 
poxvirus replication is reduced. 

In another aspect, the invention features methods of treating a subject infected with a
pathogen, e.g., a bacterial, amoebic, parasitic, or fungal pathogen. The method includes:

providing an iRNA agent, e.g., an siRNA having a structure described herein, where the
siRNA is homologous to and can silence, e.g., by cleavage, a pathogen gene;

administering the iRNA agent to a subject, preferably a human subject,

thereby treating the subject.

The target gene can be one involved in growth, cell wall synthesis, protein synthesis, 
transcription, energy metabolism, e.g., the Krebs cycle, or toxin production. 

Thus, the present invention provides for a method of treating patients infected by a
Plasmodium that causes malaria.

In a preferred embodiment, the expression of a Plasmodium gene is reduced. In 
another preferred embodiment, the gene is apical membrane antigen 1 (AMA1).

In a preferred embodiment the expression of a human gene that is required for 
Plasmodium replication is reduced.

The invention also includes methods for treating patients infected by
Mycobacterium ulcerans, or afflicted with a disease or disorder associated with this
pathogen, e.g., Buruli ulcers.

In a preferred embodiment, the expression of a Mycobacterium ulcerans gene is 
reduced.

In a preferred embodiment the expression of a human gene that is required for 
Mycobacterium ulcerans replication is reduced. 

The invention also includes methods for treating patients infected by
Mycobacterium tuberculosis, or afflicted with a disease or disorder associated with this
pathogen, e.g., tuberculosis.

In a preferred embodiment, the expression of a Mycobacterium tuberculosis gene is 
reduced. 

In a preferred embodiment the expression of a human gene that is required for 
Mycobacterium tuberculosis replication is reduced. 

The invention also includes methods for treating patients infected by
Mycobacterium leprae, or afflicted with a disease or disorder associated with this
pathogen, e.g., leprosy.

In a preferred embodiment, the expression of a Mycobacterium leprae gene is 
reduced. 

In a preferred embodiment the expression of a human gene that is required for 
Mycobacterium leprae replication is reduced.

The invention also includes methods for treating patients infected by the bacterium
Staphylococcus aureus, or afflicted with a disease or disorder associated with this
pathogen, e.g., infections of the skin and mucous membranes.

In a preferred embodiment, the expression of a Staphylococcus aureus gene is 
reduced.

In a preferred embodiment the expression of a human gene that is required for 
Staphylococcus aureus replication is reduced. 

The invention also includes methods for treating patients infected by the bacterium
Streptococcus pneumoniae, or afflicted with a disease or disorder associated with this
pathogen, e.g., pneumonia or childhood lower respiratory tract infection.

In a preferred embodiment, the expression of a Streptococcus pneumoniae gene is 
reduced. 

In a preferred embodiment the expression of a human gene that is required for
Streptococcus pneumoniae replication is reduced.

The invention also includes methods for treating patients infected by the bacterium
Streptococcus pyogenes, or afflicted with a disease or disorder associated with this
pathogen, e.g., strep throat or scarlet fever.

In a preferred embodiment, the expression of a Streptococcus pyogenes gene is 
reduced. 

In a preferred embodiment the expression of a human gene that is required for
Streptococcus pyogenes replication is reduced.

The invention also includes methods for treating patients infected by the bacterium
Chlamydia pneumoniae, or afflicted with a disease or disorder associated with this
pathogen, e.g., pneumonia or childhood lower respiratory tract infection.

In a preferred embodiment, the expression of a Chlamydia pneumoniae gene is
reduced.

In a preferred embodiment the expression of a human gene that is required for
Chlamydia pneumoniae replication is reduced. 

The invention also includes methods for treating patients infected by the bacterium
Mycoplasma pneumoniae, or afflicted with a disease or disorder associated with this
pathogen, e.g., pneumonia or childhood lower respiratory tract infection.

In a preferred embodiment, the expression of a Mycoplasma pneumoniae gene is
reduced.

In a preferred embodiment the expression of a human gene that is required for
Mycoplasma pneumoniae replication is reduced. 

In one aspect, the invention features a method of treating a subject, e.g., a human, at
risk for or afflicted with a disease or disorder characterized by an unwanted immune 
response, e.g., an inflammatory disease or disorder, or an autoimmune disease or disorder. 
The method includes: 

providing an iRNA agent, e.g., an iRNA agent having a structure described herein, 
which iRNA agent is homologous to and can silence, e.g., by cleavage, a gene which 
mediates an unwanted immune response; 

administering the iRNA agent to a subject, 

thereby treating the subject. 

In a preferred embodiment the disease or disorder is an ischemia or reperfusion 
injury, e.g., ischemia or reperfusion injury associated with acute myocardial infarction, 
unstable angina, cardiopulmonary bypass, surgical intervention, e.g., angioplasty, e.g.,
percutaneous transluminal coronary angioplasty, the response to a transplanted organ or
tissue, e.g., transplanted cardiac or vascular tissue; or thrombolysis.

In a preferred embodiment the disease or disorder is restenosis, e.g., restenosis 
associated with surgical intervention, e.g., angioplasty, e.g., percutaneous transluminal
coronary angioplasty. 

In a preferred embodiment the disease or disorder is Inflammatory Bowel Disease,
e.g., Crohn Disease or Ulcerative Colitis. 

In a preferred embodiment the disease or disorder is inflammation associated with an
infection or injury. 

In a preferred embodiment the disease or disorder is asthma, lupus, multiple sclerosis,
diabetes, e.g., type II diabetes, or arthritis, e.g., rheumatoid or psoriatic.

In particularly preferred embodiments the iRNA agent silences an integrin or co- 
ligand thereof, e.g., VLA4, VCAM, ICAM. 

In particularly preferred embodiments the iRNA agent silences a selectin or co-ligand
thereof, e.g., P-selectin, E-selectin (ELAM), L-selectin, P-selectin glycoprotein-1 (PSGL-1).

In particularly preferred embodiments the iRNA agent silences a component of the
complement system, e.g., C3, C5, C3aR, C5aR, C3 convertase, C5 convertase. 

In particularly preferred embodiments the iRNA agent silences a chemokine or 
receptor thereof, e.g., TNFα, TNFβ, IL-1α, IL-1β, IL-2, IL-2R, IL-4, IL-4R, IL-5, IL-6, IL-8,
TNFRI, TNFRII, IgE, SCYA11, CCR3.

In other embodiments the iRNA agent silences GCSF, Gro1, Gro2, Gro3, PF4, MIG,
Pro-Platelet Basic Protein (PPBP), MIP-1α, MIP-1β, RANTES, MCP-1, MCP-2, MCP-3,
CMBKR1, CMBKR2, CMBKR3, CMBKR5, AIF-1, I-309.

In one aspect, the invention features a method of treating a subject, e.g., a human, at
risk for or afflicted with acute pain or chronic pain. The method includes:

providing an iRNA agent, which iRNA is homologous to and can silence, e.g., by 
cleavage, a gene which mediates the processing of pain; 

administering the iRNA to a subject, 

thereby treating the subject. 

In particularly preferred embodiments the iRNA agent silences a component of an ion
channel.

In particularly preferred embodiments the iRNA agent silences a neurotransmitter 
receptor or ligand. 

In one aspect, the invention features a method of treating a subject, e.g., a human, at
risk for or afflicted with a neurological disease or disorder. The method includes:

providing an iRNA agent which iRNA is homologous to and can silence, e.g., by 
cleavage, a gene which mediates a neurological disease or disorder; 

administering the iRNA agent to a subject,

thereby treating the subject.

In a preferred embodiment the disease or disorder is Alzheimer Disease or Parkinson
Disease.

In particularly preferred embodiments the iRNA agent silences an amyloid-family 
gene, e.g., APP; a presenilin gene, e.g., PSEN1 and PSEN2, or α-synuclein.

In a preferred embodiment the disease or disorder is a neurodegenerative trinucleotide
repeat disorder, e.g., Huntington disease, dentatorubral pallidoluysian atrophy or a
spinocerebellar ataxia, e.g., SCA1, SCA2, SCA3 (Machado-Joseph disease), SCA7 or SCA8.

In particularly preferred embodiments the iRNA agent silences HD, DRPLA, SCA1, SCA2,
MJD1, CACNL1A4, SCA7, SCA8.

The loss of heterozygosity (LOH) can result in hemizygosity for sequence, e.g.,
genes, in the area of LOH. This can result in a significant genetic difference between normal
and disease-state cells, e.g., cancer cells, and provides a useful point of distinction between
them. This difference can arise because a gene or other
sequence is heterozygous in euploid cells but is hemizygous in cells having LOH. The
regions of LOH will often include a gene, the loss of which promotes unwanted proliferation,
e.g., a tumor suppressor gene, and other sequences including, e.g., other genes, in some cases
a gene which is essential for normal function, e.g., growth. Methods of the invention rely, in
part, on the specific cleavage or silencing of one allele of an essential gene with an iRNA
agent of the invention. The iRNA agent is selected such that it targets the single allele of the
essential gene found in the cells having LOH but does not silence the other allele, which is
present in cells which do not show LOH. In essence, it discriminates between the two
alleles, preferentially silencing the selected allele. Polymorphisms, e.g., SNPs, of
essential genes that are affected by LOH are thus used as a target for a disorder characterized by
cells having LOH, e.g., cancer cells having LOH.
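
The allele discrimination described above can be illustrated with a minimal sketch. It assumes two equal-length allele sequences that differ at a single SNP and a 21-nucleotide target window; the sequences, the window length and the function names are hypothetical placeholders chosen for illustration, not values taken from this disclosure. The sketch lists the windows of the allele retained in the LOH cells that span the polymorphic position and therefore differ from the other allele.

```python
def differing_positions(allele_a, allele_b):
    """Return the 0-based positions at which two equal-length alleles differ."""
    if len(allele_a) != len(allele_b):
        raise ValueError("alleles must have the same length")
    return [i for i, (a, b) in enumerate(zip(allele_a, allele_b)) if a != b]


def allele_specific_windows(loh_allele, other_allele, length=21):
    """Return (start, window) pairs from loh_allele that span a polymorphism.

    Only windows covering at least one differing position can distinguish
    the allele retained in LOH cells from the other allele.
    """
    diffs = differing_positions(loh_allele, other_allele)
    windows = []
    for start in range(len(loh_allele) - length + 1):
        if any(start <= p < start + length for p in diffs):
            windows.append((start, loh_allele[start:start + length]))
    return windows


# Hypothetical 40-nt toy alleles differing only at position 15 (not real gene sequence).
LOH_ALLELE = "ATGGCTAGCTAGGATCCGATTACGGATCCAGCTTAACGGT"
OTHER_ALLELE = LOH_ALLELE[:15] + "T" + LOH_ALLELE[16:]

candidates = allele_specific_windows(LOH_ALLELE, OTHER_ALLELE)
for start, window in candidates:
    print(start, window)
```

Each printed window is a candidate target region only in the sense that it contains the polymorphism; whether a given agent actually discriminates between the alleles would have to be determined experimentally.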

E.g., one of ordinary skill in the art can identify essential genes which are in
proximity to tumor suppressor genes, and which are within a LOH region which includes the
tumor suppressor gene. The gene encoding the large subunit of human RNA polymerase II,
POLR2A, a gene located in close proximity to the tumor suppressor gene p53, is such a gene.
It frequently occurs within a region of LOH in cancer cells. Other genes that occur within
LOH regions and are lost in many cancer cell types include the group comprising replication
protein A 70-kDa subunit, replication protein A 32-kD, ribonucleotide reductase, thymidylate
synthase, TATA associated factor 2H, ribosomal protein S14, eukaryotic initiation factor 5A,
alanyl tRNA synthetase, cysteinyl tRNA synthetase, NaK ATPase, alpha-1 subunit, and
transferrin receptor.

Accordingly, the invention features a method of treating a disorder characterized by
LOH, e.g., cancer. The method includes: 

optionally, determining the genotype of the allele of a gene in the region of LOH and 
preferably determining the genotype of both alleles of the gene in a normal cell; 

providing an iRNA agent which preferentially cleaves or silences the allele found in
the LOH cells;

administering the iRNA to the subject,

thereby treating the disorder.

The invention also includes an iRNA agent disclosed herein, e.g., an iRNA agent which
can preferentially silence, e.g., cleave, one allele of a polymorphic gene.

In another aspect, the invention provides a method of cleaving or silencing more than 
one gene with an iRNA agent. In these embodiments the iRNA agent is selected so that it 
has sufficient homology to a sequence found in more than one gene. For example, the
sequence AAGCTGGCCCTGGACATGGAGAT (SEQ ID NO:6736) is conserved between
mouse lamin B1, lamin B2, keratin complex 2-gene 1 and lamin A/C. Thus an iRNA agent
targeted to this sequence would effectively silence the entire collection of genes. 
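
One way to look for a sequence shared by several genes, as in the example above, is to intersect the sets of fixed-length subsequences of each transcript. The following is a minimal sketch under stated assumptions: the toy transcripts are hypothetical (arbitrary flanking bases placed around SEQ ID NO:6736, not real lamin or keratin sequence), and the function name is illustrative only.

```python
def shared_kmers(sequences, k):
    """Return the set of length-k subsequences present in every input sequence."""
    def kmers(seq):
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    common = kmers(sequences[0])
    for seq in sequences[1:]:
        common &= kmers(seq)
    return common


# Sequence quoted in the text (SEQ ID NO:6736).
TARGET = "AAGCTGGCCCTGGACATGGAGAT"

# Hypothetical toy transcripts: arbitrary flanks around the target, for illustration only.
toy_transcripts = {
    "gene_a": "GGCATC" + TARGET + "TTGACC",
    "gene_b": "ACCGTA" + TARGET + "CGATAG",
    "gene_c": "TTTAGC" + TARGET + "GGCTAA",
}

candidates = shared_kmers(list(toy_transcripts.values()), len(TARGET))
print(sorted(candidates))
```

Exact matching is used here for simplicity; the text requires only "sufficient homology", so a practical search might tolerate mismatches.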

The invention also includes an iRNA agent disclosed herein, which can silence more 
than one gene. 

ROUTE OF DELIVERY

For ease of exposition the formulations, compositions and methods in this section are 
discussed largely with regard to unmodified iRNA agents. It should be understood, however, 
that these formulations, compositions and methods can be practiced with other iRNA agents,
e.g., modified iRNA agents, and such practice is within the invention. A composition that
includes an iRNA can be delivered to a subject by a variety of routes. Exemplary routes
include: intravenous, topical, rectal, anal, vaginal, nasal, pulmonary, ocular.

The iRNA molecules of the invention can be incorporated into pharmaceutical 
compositions suitable for administration. Such compositions typically include one or more 
species of iRNA and a pharmaceutically acceptable carrier. As used herein the language
"pharmaceutically acceptable carrier" is intended to include any and all solvents, dispersion
media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents,
and the like, compatible with pharmaceutical administration. The use of such media and
agents for pharmaceutically active substances is well known in the art. Except insofar as any
conventional media or agent is incompatible with the active compound, use thereof in the
compositions is contemplated. Supplementary active compounds can also be incorporated
into the compositions. 

The pharmaceutical compositions of the present invention may be administered in a 
number of ways depending upon whether local or systemic treatment is desired and upon the 
area to be treated. Administration may be topical (including ophthalmic, vaginal, rectal,
intranasal, transdermal), oral or parenteral. Parenteral administration includes intravenous 
drip, subcutaneous, intraperitoneal or intramuscular injection, or intrathecal or 
intraventricular administration. 

The route and site of administration may be chosen to enhance targeting. For 
example, to target muscle cells, intramuscular injection into the muscles of interest would be
a logical choice. Lung cells might be targeted by administering the iRNA in aerosol form.
The vascular endothelial cells could be targeted by coating a balloon catheter with the iRNA
and mechanically introducing the iRNA.

Formulations for topical administration may include transdermal patches, ointments, 
lotions, creams, gels, drops, suppositories, sprays, liquids and powders. Conventional
pharmaceutical carriers, aqueous, powder or oily bases, thickeners and the like may be
necessary or desirable. Coated condoms, gloves and the like may also be useful.

Compositions for oral administration include powders or granules, suspensions or 
solutions in water, syrups, elixirs or non-aqueous media, tablets, capsules, lozenges, or 
troches. In the case of tablets, carriers that can be used include lactose, sodium citrate and
salts of phosphoric acid. Various disintegrants such as starch, and lubricating agents such as 
magnesium stearate, sodium lauryl sulfate and talc, are commonly used in tablets. For oral 
administration in capsule form, useful diluents are lactose and high molecular weight 
polyethylene glycols. When aqueous suspensions are required for oral use, the nucleic acid 
compositions can be combined with emulsifying and suspending agents. If desired, certain
sweetening and/or flavoring agents can be added. 

Compositions for intrathecal or intraventricular administration may include sterile
aqueous solutions which may also contain buffers, diluents and other suitable additives.

Formulations for parenteral administration may include sterile aqueous solutions 
which may also contain buffers, diluents and other suitable additives. Intraventricular
injection may be facilitated by an intraventricular catheter, for example, attached to a
reservoir. For intravenous use, the total concentration of solutes should be controlled to
render the preparation isotonic. 

For ocular administration, ointments or droppable liquids may be delivered by ocular
delivery systems known to the art such as applicators or eye droppers. Such compositions can
include mucomimetics such as hyaluronic acid, chondroitin sulfate, hydroxypropyl
methylcellulose or poly(vinyl alcohol), preservatives such as sorbic acid, EDTA or 
benzylchronium chloride, and the usual quantities of diluents and/or carriers. 

Topical Delivery 

For ease of exposition the formulations, compositions and methods in this section are
discussed largely with regard to unmodified iRNA agents. It should be understood, however,
that these formulations, compositions and methods can be practiced with other iRNA agents,
e.g., modified iRNA agents, and such practice is within the invention. In a preferred
embodiment, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) is delivered to a subject via topical administration. "Topical
administration" refers to the delivery to a subject by contacting the formulation directly to a
surface of the subject. The most common form of topical delivery is to the skin, but a
composition disclosed herein can also be directly applied to other surfaces of the body, e.g.,
to the eye, a mucous membrane, to surfaces of a body cavity or to an internal surface. As
mentioned above, the most common topical delivery is to the skin. The term encompasses
several routes of administration including, but not limited to, topical and transdermal. These
modes of administration typically include penetration of the skin's permeability barrier and
efficient delivery to the target tissue or stratum. Topical administration can be used as a
means to penetrate the epidermis and dermis and ultimately achieve systemic delivery of the
composition. Topical administration can also be used as a means to selectively deliver
oligonucleotides to the epidermis or dermis of a subject, or to specific strata thereof, or to an
underlying tissue.

The term "skin," as used herein, refers to the epidermis and/or dermis of an animal.
Mammalian skin consists of two major, distinct layers. The outer layer of the skin is called
the epidermis. The epidermis is composed of the stratum corneum, the stratum granulosum,
the stratum spinosum, and the stratum basale, with the stratum corneum being at the surface
of the skin and the stratum basale being the deepest portion of the epidermis. The epidermis
is between 50 µm and 0.2 mm thick, depending on its location on the body.

Beneath the epidermis is the dermis, which is significantly thicker than the epidermis.
The dermis is primarily composed of collagen in the form of fibrous bundles. The
collagenous bundles provide support for, inter alia, blood vessels, lymph capillaries, glands,
nerve endings and immunologically active cells.

One of the major functions of the skin as an organ is to regulate the entry of 
substances into the body. The principal permeability barrier of the skin is provided by the 
stratum corneum, which is formed from many layers of cells in various states of
differentiation. The spaces between cells in the stratum corneum are filled with different
lipids arranged in lattice-like formations that provide seals to further enhance the skin's
permeability barrier. 

The permeability barrier provided by the skin is such that it is largely impermeable to 
molecules having molecular weight greater than about 750 Da. For larger molecules to cross 
the skin's permeability barrier, mechanisms other than normal osmosis must be used.
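
As a rough illustration of why the size limit matters for iRNA agents, the sketch below compares an estimated duplex mass with the approximately 750 Da limit stated above. The per-nucleotide average mass (about 330 Da) and the 21-nucleotide strand length are rule-of-thumb assumptions introduced here for illustration; they are not values given in this section.

```python
# Assumed rule-of-thumb average mass of one RNA nucleotide, in daltons.
AVG_NUCLEOTIDE_MASS_DA = 330.0

# Approximate permeability limit of the skin stated in the text, in daltons.
SKIN_PERMEABILITY_LIMIT_DA = 750.0


def approx_duplex_mass_da(strand_length_nt, strands=2):
    """Estimate the mass of an oligonucleotide agent from its strand length."""
    return strand_length_nt * strands * AVG_NUCLEOTIDE_MASS_DA


def exceeds_skin_limit(mass_da):
    """Return True if the mass is above the approximate skin permeability limit."""
    return mass_da > SKIN_PERMEABILITY_LIMIT_DA


# Hypothetical 21-nt double-stranded agent.
mass = approx_duplex_mass_da(21)
print(f"estimated mass: {mass:.0f} Da, exceeds limit: {exceeds_skin_limit(mass)}")
```

Under these assumptions a short duplex is more than an order of magnitude above the limit, which is consistent with the discussion of penetration enhancers and other delivery aids that follows.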

Several factors determine the permeability of the skin to administered agents. These
factors include the characteristics of the treated skin, the characteristics of the delivery agent,
interactions between both the drug and delivery agent and the drug and skin, the dosage of
the drug applied, the form of treatment, and the post treatment regimen. To selectively target
the epidermis and dermis, it is sometimes possible to formulate a composition that comprises
one or more penetration enhancers that will enable penetration of the drug to a preselected
stratum.

Transdermal delivery is a valuable route for the administration of lipid soluble
therapeutics. The dermis is more permeable than the epidermis and therefore absorption is
much more rapid through abraded, burned or denuded skin. Inflammation and other
physiologic conditions that increase blood flow to the skin also enhance transdermal
absorption. Absorption via this route may be enhanced by the use of an oily vehicle
(inunction) or through the use of one or more penetration enhancers. Other effective ways to
deliver a composition disclosed herein via the transdermal route include hydration of the skin
and the use of controlled release topical patches. The transdermal route provides a
potentially effective means to deliver a composition disclosed herein for systemic and/or
local therapy.

In addition, iontophoresis (transfer of ionic solutes through biological membranes
under the influence of an electric field) (Lee et al., Critical Reviews in Therapeutic Drug
Carrier Systems, 1991, p. 163), phonophoresis or sonophoresis (use of ultrasound to enhance
the absorption of various therapeutic agents across biological membranes, notably the skin
and the cornea) (Lee et al., Critical Reviews in Therapeutic Drug Carrier Systems, 1991, p.
166), and optimization of vehicle characteristics relative to dose deposition and retention at the
site of administration (Lee et al., Critical Reviews in Therapeutic Drug Carrier Systems,
1991, p. 168) may be useful methods for enhancing the transport of topically applied
compositions across skin and mucosal sites.

The compositions and methods provided may also be used to examine the function of
various proteins and genes in vitro in cultured or preserved dermal tissues and in animals.
The invention can be thus applied to examine the function of any gene. The methods of the
invention can also be used therapeutically or prophylactically, for example, for the
treatment of animals that are known or suspected to suffer from diseases such as psoriasis,
lichen planus, toxic epidermal necrolysis, erythema multiforme, basal cell carcinoma,
squamous cell carcinoma, malignant melanoma, Paget's disease, Kaposi's sarcoma,
pulmonary fibrosis, Lyme disease and viral, fungal and bacterial infections of the skin.


Pulmonary Delivery 

For ease of exposition the formulations, compositions and methods in this section are
discussed largely with regard to unmodified iRNA agents. It should be understood, however,
that these formulations, compositions and methods can be practiced with other iRNA agents,
e.g., modified iRNA agents, and such practice is within the invention. A composition that
includes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or
precursor thereof) can be administered to a subject by pulmonary delivery. Pulmonary
delivery compositions can be delivered by inhalation by the patient of a dispersion so that the
composition, preferably iRNA, within the dispersion can reach the lung where it can be
readily absorbed through the alveolar region directly into blood circulation. Pulmonary
delivery can be effective both for systemic delivery and for localized delivery to treat
diseases of the lungs.

Pulmonary delivery can be achieved by different approaches, including the use of
nebulized, aerosolized, micellar and dry powder-based formulations. Delivery can be
achieved with liquid nebulizers, aerosol-based inhalers, and dry powder dispersion devices.
Metered-dose devices are preferred. One of the benefits of using an atomizer or inhaler is
that the potential for contamination is minimized because the devices are self contained. Dry
powder dispersion devices, for example, deliver drugs that may be readily formulated as dry
powders. An iRNA composition may be stably stored as a lyophilized or spray-dried powder
by itself or in combination with suitable powder carriers. The delivery of a composition for
inhalation can be mediated by a dosing timing element which can include a timer, a dose
counter, time measuring device, or a time indicator which when incorporated into the device
enables dose tracking, compliance monitoring, and/or dose triggering to a patient during
administration of the aerosol medicament.

The term "powder" means a composition that consists of finely dispersed solid
particles that are free flowing and capable of being readily dispersed in an inhalation device
and subsequently inhaled by a subject so that the particles reach the lungs to permit
penetration into the alveoli. Thus, the powder is said to be "respirable." Preferably the
average particle size is less than about 10 µm in diameter, preferably with a relatively uniform
spheroidal shape distribution. More preferably the diameter is less than about 7.5 µm and
most preferably less than about 5.0 µm. Usually the particle size distribution is between
about 0.1 µm and about 5 µm in diameter, particularly about 0.3 µm to about 5 µm.
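
The particle size preferences above can be expressed as a simple check. This is a minimal sketch: the tier labels and function names are illustrative choices, the "about" in each limit is treated as an exact cut-off for simplicity, and the sample diameters are hypothetical.

```python
def respirable_tier(average_diameter_um):
    """Map an average particle diameter (in micrometres) to the stated preference tier."""
    if average_diameter_um <= 0:
        raise ValueError("diameter must be positive")
    if average_diameter_um < 5.0:
        return "most preferred"
    if average_diameter_um < 7.5:
        return "more preferred"
    if average_diameter_um < 10.0:
        return "preferred"
    return "outside stated range"


def fraction_in_usual_range(diameters_um, low=0.1, high=5.0):
    """Return the fraction of particles whose diameter lies within [low, high] micrometres."""
    inside = [d for d in diameters_um if low <= d <= high]
    return len(inside) / len(diameters_um)


# Hypothetical measured diameters, in micrometres.
sample = [0.05, 0.5, 2.0, 6.0]
average = sum(sample) / len(sample)
print(respirable_tier(average), fraction_in_usual_range(sample))
```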

The term "dry" means that the composition has a moisture content below about 10%
by weight (% w) water, usually below about 5% w and preferably less than about 3% w. A
dry composition can be such that the particles are readily dispersible in an inhalation device
to form an aerosol.
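
The moisture limits in this definition can be checked as follows. This is a minimal sketch that assumes moisture content is estimated by loss on drying (mass before and after drying); that measurement approach, the function names and the sample masses are assumptions made for illustration and are not specified in the text.

```python
def moisture_percent(initial_mass_mg, dried_mass_mg):
    """Estimate moisture content (% by weight) from mass lost on drying."""
    if initial_mass_mg <= 0 or dried_mass_mg < 0 or dried_mass_mg > initial_mass_mg:
        raise ValueError("masses must satisfy 0 <= dried <= initial and initial > 0")
    return 100.0 * (initial_mass_mg - dried_mass_mg) / initial_mass_mg


def dryness_tier(moisture_pct):
    """Classify a moisture content against the limits of about 10%, 5% and 3% w."""
    if moisture_pct < 3.0:
        return "preferred"
    if moisture_pct < 5.0:
        return "usual"
    if moisture_pct < 10.0:
        return "dry"
    return "not dry"


# Hypothetical sample: 100 mg of powder weighing 96 mg after drying.
pct = moisture_percent(100.0, 96.0)
print(pct, dryness_tier(pct))
```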

The term "therapeutically effective amount" is the amount present in the composition 
that is needed to provide the desired level of drug in the subject to be treated to give the 
anticipated physiological response.

The term "physiologically effective amount" is that amount delivered to a subject to
give the desired palliative or curative effect. 

The term "pharmaceutically acceptable carrier" means that the carrier can be taken 
into the lungs with no significant adverse toxicological effects on the lungs. 

The types of pharmaceutical excipients that are useful as carriers include stabilizers
such as human serum albumin (HSA), bulking agents such as carbohydrates, amino acids and
polypeptides; pH adjusters or buffers; salts such as sodium chloride; and the like. These 
carriers may be in a crystalline or amorphous form or may be a mixture of the two. 

Bulking agents that are particularly valuable include compatible carbohydrates,
polypeptides, amino acids or combinations thereof. Suitable carbohydrates include
monosaccharides such as galactose, D-mannose, sorbose, and the like; disaccharides, such as
lactose, trehalose, and the like; cyclodextrins, such as 2-hydroxypropyl-β-cyclodextrin;
and polysaccharides, such as raffinose, maltodextrins, dextrans, and the like; alditols, such as
mannitol, xylitol, and the like. A preferred group of carbohydrates includes lactose,
trehalose, raffinose, maltodextrins, and mannitol. Suitable polypeptides include aspartame.
Amino acids include alanine and glycine, with glycine being preferred.

Additives, which are minor components of the composition of this invention, may be 
included for conformational stability during spray drying and for improving dispersibility of 
the powder. These additives include hydrophobic amino acids such as tryptophan, tyrosine, 
leucine, phenylalanine, and the like. 

Suitable pH adjusters or buffers include organic salts prepared from organic acids and 
bases, such as sodium citrate, sodium ascorbate, and the like; sodium citrate is preferred. 

Pulmonary administration of a micellar iRNA formulation may be achieved through 
metered dose spray devices with propellants such as tetrafluoroethane, heptafluoroethane, 
dimethylfluoropropane, tetrafluoropropane, butane, isobutane, dimethyl ether and other non-CFC 
and CFC propellants. 

Oral or Nasal Delivery 

For ease of exposition the formulations, compositions and methods in this section are 
discussed largely with regard to unmodified iRNA agents. It should be understood, however, 
that these formulations, compositions and methods can be practiced with other iRNA agents, 
e.g., modified iRNA agents, and such practice is within the invention. Both the oral and 
nasal membranes offer advantages over other routes of administration. For example, drugs 
administered through these membranes have a rapid onset of action, provide therapeutic 
plasma levels, avoid first pass effect of hepatic metabolism, and avoid exposure of the drug 
to the hostile gastrointestinal (GI) environment. Additional advantages include easy access 
to the membrane sites so that the drug can be applied, localized and removed easily. 

In oral delivery, compositions can be targeted to a surface of the oral cavity, e.g., to 
sublingual mucosa, which includes the membrane of the ventral surface of the tongue and the 
floor of the mouth, or the buccal mucosa, which constitutes the lining of the cheek. The 
sublingual mucosa is relatively permeable, thus giving rapid absorption and acceptable 
bioavailability of many drugs. Further, the sublingual mucosa is convenient, acceptable and 
easily accessible. 

The ability of molecules to permeate through the oral mucosa appears to be related to 
molecular size, lipid solubility and peptide protein ionization. Small molecules, less than 
1000 daltons, appear to cross mucosa rapidly. As molecular size increases, the permeability 
decreases rapidly. Lipid soluble compounds are more permeable than non-lipid soluble 
molecules. Maximum absorption occurs when molecules are un-ionized or neutral in 
electrical charges. Therefore charged molecules present the biggest challenges to absorption 
through the oral mucosae. 

A pharmaceutical composition of iRNA may also be administered to the buccal cavity 
of a human being by spraying into the cavity, without inhalation, from a metered dose spray 
dispenser, a mixed micellar pharmaceutical formulation as described above and a propellant. 
In one embodiment, the dispenser is first shaken prior to spraying the pharmaceutical 
formulation and propellant into the buccal cavity. 

Devices 

For ease of exposition the devices, formulations, compositions and methods in this 
section are discussed largely with regard to unmodified iRNA agents. It should be 
understood, however, that these devices, formulations, compositions and methods can be 
practiced with other iRNA agents, e.g., modified iRNA agents, and such practice is within the 
invention. An iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a 
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA 
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) can be disposed on or in a device, e.g., a device which is implanted or 
otherwise placed in a subject. Exemplary devices include devices which are introduced into 
the vasculature, e.g., devices inserted into the lumen of a vascular tissue, or which devices 
themselves form a part of the vasculature, including stents, catheters, heart valves, and other 
vascular devices. These devices, e.g., catheters or stents, can be placed in the vasculature of 
the lung, heart, or leg. 

Other devices include non-vascular devices, e.g., devices implanted in the 
peritoneum, or in organ or glandular tissue, e.g., artificial organs. The device can release a 
therapeutic substance in addition to an iRNA, e.g., a device can release insulin. 

Other devices include artificial joints, e.g., hip joints, and other orthopedic implants. 

In one embodiment, unit doses or measured doses of a composition that includes 
iRNA are dispensed by an implanted device. The device can include a sensor that monitors a 
parameter within a subject. For example, the device can include a pump and, optionally, 
associated electronics. 

Tissue, e.g., cells or organs, can be treated with an iRNA agent, e.g., a double-stranded 
iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can 
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded 
iRNA agent, or sRNA agent, or precursor thereof) ex vivo and then administered or 
implanted in a subject. 

The tissue can be autologous, allogeneic, or xenogeneic tissue. E.g., tissue can be 
treated to reduce graft v. host disease. In other embodiments, the tissue is allogeneic and the 
tissue is treated to treat a disorder characterized by unwanted gene expression in that tissue. 
E.g., tissue, e.g., hematopoietic cells, e.g., bone marrow hematopoietic cells, can be treated to 
inhibit unwanted cell proliferation. 

Introduction of treated tissue, whether autologous or transplant, can be combined with 
other therapies. 

In some implementations, the iRNA treated cells are insulated from other cells, e.g., 
by a semi-permeable porous barrier that prevents the cells from leaving the implant, but 
enables molecules from the body to reach the cells and molecules produced by the cells to 
enter the body. In one embodiment, the porous barrier is formed from alginate. 

In one embodiment, a contraceptive device is coated with or contains an iRNA agent, 
e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA 
agent which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent, 
e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof). Exemplary 
devices include condoms, diaphragms, IUDs (implantable uterine devices), sponges, vaginal 
sheaths, and birth control devices. In one embodiment, the iRNA is chosen to inactivate sperm 
or egg. In another embodiment, the iRNA is chosen to be complementary to a viral or 
pathogen RNA, e.g., an RNA of an STD. In some instances, the iRNA composition can 
include a spermicide. 

DOSAGE 

In one aspect, the invention features a method of administering an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, to a subject (e.g., a human subject). The 
method includes administering a unit dose of the iRNA agent, e.g., a sRNA agent, e.g., a 
double-stranded sRNA agent that (a) has a double-stranded part 19-25 nucleotides (nt) long, 
preferably 21-23 nt, (b) is complementary to a target RNA (e.g., an endogenous or pathogen 
target RNA), and, optionally, (c) includes at least one 3' overhang 1-5 nucleotides long. In 
one embodiment, the unit dose is less than 1.4 mg per kg of body weight, or less than 10, 5, 2, 
1, 0.5, 0.1, 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001, 0.00005 or 0.00001 mg per kg of 
bodyweight, and less than 200 nmole of RNA agent (e.g., about 4.4 x 10^16 copies) per kg of 
bodyweight, or less than 1500, 750, 300, 150, 75, 15, 7.5, 1.5, 0.75, 0.15, 0.075, 0.015, 
0.0075, 0.0015, 0.00075, 0.00015 nmole of RNA agent per kg of bodyweight. 
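
The dose limits above are stated both by mass (mg per kg) and by amount (nmole per kg). As a minimal sketch of how the two relate, the conversion can be written as follows. The molecular weight used in the example (14,000 g/mol) is a hypothetical value chosen only for illustration and is not taken from the text; the function names are likewise hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (SI definition)


def mg_per_kg_to_nmol_per_kg(dose_mg_per_kg: float, mol_weight_g_per_mol: float) -> float:
    """Convert a mass dose (mg/kg) to a molar dose (nmol/kg).

    mg -> g is a factor of 1e-3 and mol -> nmol is a factor of 1e9,
    so nmol = mg * 1e6 / MW.
    """
    if mol_weight_g_per_mol <= 0:
        raise ValueError("molecular weight must be positive")
    return dose_mg_per_kg * 1e6 / mol_weight_g_per_mol


def nmol_to_copies(amount_nmol: float) -> float:
    """Convert an amount in nmol to a number of molecules (copies)."""
    return amount_nmol * 1e-9 * AVOGADRO


if __name__ == "__main__":
    assumed_mw = 14_000.0  # hypothetical molecular weight, g/mol
    nmol = mg_per_kg_to_nmol_per_kg(1.4, assumed_mw)
    print(f"{nmol:.1f} nmol/kg, {nmol_to_copies(nmol):.3e} copies/kg")
```

The result depends directly on the molecular weight assumed for the agent, so the same mass dose corresponds to different molar doses for agents of different length or chemistry.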

The defined amount can be an amount effective to treat or prevent a disease or 
disorder, e.g., a disease or disorder associated with the target RNA. The unit dose, for 
example, can be administered by injection (e.g., intravenous or intramuscular), an inhaled 
dose, or a topical application. Particularly preferred dosages are less than 2, 1, or 0.1 mg/kg 
of body weight. 

In a preferred embodiment, the unit dose is administered less frequently than once a 
day, e.g., less than every 2, 4, 8 or 30 days. In another embodiment, the unit dose is not 
administered with a frequency (e.g., not a regular frequency). For example, the unit dose 
may be administered a single time. 

In one embodiment, the effective dose is administered with other traditional 
therapeutic modalities. In one embodiment, the subject has a viral infection and the modality 
is an antiviral agent other than an iRNA agent, e.g., other than a double-stranded iRNA 
agent, or sRNA agent. In another embodiment, the subject has atherosclerosis and the 
effective dose of an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, is 
administered in combination with, e.g., after surgical intervention, e.g., angioplasty. 

In one embodiment, a subject is administered an initial dose and one or more 
maintenance doses of an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof). The maintenance dose or doses are generally lower than the initial dose, 
e.g., one-half less than the initial dose. A maintenance regimen can include treating the subject 
with a dose or doses ranging from 0.01 µg to 1.4 mg/kg of body weight per day, e.g., 10, 1, 
0.1, 0.01, 0.001, or 0.00001 mg per kg of bodyweight per day. The maintenance doses are 
preferably administered no more than once every 5, 10, or 30 days. Further, the treatment 
regimen may last for a period of time which will vary depending upon the nature of the 
particular disease, its severity and the overall condition of the patient. In preferred 
embodiments the dosage may be delivered no more than once per day, e.g., no more than 
once per 24, 36, 48, or more hours, e.g., no more than once for every 5 or 8 days. Following 
treatment, the patient can be monitored for changes in his condition and for alleviation of the 
symptoms of the disease state. The dosage of the compound may either be increased in the 
event the patient does not respond significantly to current dosage levels, or the dose may be 
decreased if an alleviation of the symptoms of the disease state is observed, if the disease 
state has been ablated, or if undesired side-effects are observed. 
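
The regimen logic described above (a maintenance dose lower than the initial dose, followed by adjustment based on monitoring) can be sketched as follows. This is an illustrative sketch under stated assumptions, not a clinical algorithm: the function names, the default fraction of one-half, and the rule that any reason to decrease takes priority over non-response are all assumptions made for the example.

```python
def maintenance_dose(initial_mg_per_kg: float, fraction: float = 0.5) -> float:
    """Return a maintenance dose as a fraction of the initial dose.

    The default of one-half mirrors the example in the text; any other
    fraction is an assumption supplied by the caller.
    """
    if initial_mg_per_kg <= 0:
        raise ValueError("initial dose must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return initial_mg_per_kg * fraction


def dose_adjustment(responds: bool,
                    symptoms_alleviated: bool = False,
                    disease_ablated: bool = False,
                    side_effects: bool = False) -> str:
    """Suggest a direction of change based on monitoring results.

    Decrease on alleviation, ablation, or side effects; increase when the
    patient does not respond; otherwise hold. Giving the decrease
    conditions priority is an assumption of this sketch.
    """
    if symptoms_alleviated or disease_ablated or side_effects:
        return "decrease"
    if not responds:
        return "increase"
    return "hold"


if __name__ == "__main__":
    print(maintenance_dose(1.0))
    print(dose_adjustment(responds=False))
```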

The effective dose can be administered in a single dose or in two or more doses, as 
desired or considered appropriate under the specific circumstances. If desired to facilitate 
repeated or frequent infusions, implantation of a delivery device, e.g., a pump, semi-permanent 
stent (e.g., intravenous, intraperitoneal, intracisternal or intracapsular), or 
reservoir may be advisable. 

In one embodiment, the iRNA agent pharmaceutical composition includes a plurality 
of iRNA agent species. In another embodiment, the iRNA agent species has sequences that 
are non-overlapping and non-adjacent to another species with respect to a naturally occurring 
target sequence. In another embodiment, the plurality of iRNA agent species is specific for 
different naturally occurring target genes. In another embodiment, the iRNA agent is allele 
specific. 

In some cases, a patient is treated with an iRNA agent in conjunction with other 
therapeutic modalities. For example, a patient being treated for a viral disease, e.g., an HIV 
associated disease (e.g., AIDS), may be administered an iRNA agent specific for a target gene 
essential to the virus in conjunction with a known antiviral agent (e.g., a protease inhibitor or 
reverse transcriptase inhibitor). In another example, a patient being treated for cancer may be 
administered an iRNA agent specific for a target essential for tumor cell proliferation in 
conjunction with a chemotherapy. 

Following successful treatment, it may be desirable to have the patient undergo 
maintenance therapy to prevent the recurrence of the disease state, wherein the compound of 
the invention is administered in maintenance doses, ranging from 0.01 µg to 100 g per kg of 
body weight (see US 6,107,094). 

The concentration of the iRNA agent composition is an amount sufficient to be 
effective in treating or preventing a disorder or to regulate a physiological condition in 
humans. The concentration or amount of iRNA agent administered will depend on the 
parameters determined for the agent and the method of administration, e.g., nasal, buccal, 
pulmonary. For example, nasal formulations tend to require much lower concentrations of 
some ingredients in order to avoid irritation or burning of the nasal passages. It is sometimes 
desirable to dilute an oral formulation up to 10-100 times in order to provide a suitable nasal 
formulation. 
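
The dilution step mentioned above is simple arithmetic, sketched here for concreteness. The function name `nasal_concentration`, the concentration units, and the strict enforcement of the 10 to 100 fold range are assumptions made for illustration only.

```python
def nasal_concentration(oral_concentration: float, dilution_factor: float) -> float:
    """Return the ingredient concentration after diluting an oral formulation.

    The 10-100 fold window follows the range given in the text; rejecting
    values outside it is an assumption of this sketch. Units of the result
    match the units of `oral_concentration`.
    """
    if oral_concentration < 0:
        raise ValueError("concentration cannot be negative")
    if not 10 <= dilution_factor <= 100:
        raise ValueError("dilution factor expected in the range 10-100")
    return oral_concentration / dilution_factor


if __name__ == "__main__":
    # A hypothetical oral formulation at 50 units/mL, diluted 10-fold.
    print(nasal_concentration(50.0, 10))
```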

Certain factors may influence the dosage required to effectively treat a subject, 
including but not limited to the severity of the disease or disorder, previous treatments, the 
general health and/or age of the subject, and other diseases present. Moreover, treatment of a 
subject with a therapeutically effective amount of an iRNA agent, e.g., a double-stranded 
iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be 
processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded 
iRNA agent, or sRNA agent, or precursor thereof) can include a single treatment 
or, preferably, can include a series of treatments. It will also be appreciated that the effective 
dosage of an iRNA agent such as a sRNA agent used for treatment may increase or decrease 
over the course of a particular treatment. Changes in dosage may result and become apparent 
from the results of diagnostic assays as described herein. For example, the subject can be 
monitored after administering an iRNA agent composition. Based on information from the 
monitoring, an additional amount of the iRNA agent composition can be administered. 

Dosing is dependent on severity and responsiveness of the disease condition to be 
treated, with the course of treatment lasting from several days to several months, or until a 
cure is effected or a diminution of disease state is achieved. Optimal dosing schedules can be 
calculated from measurements of drug accumulation in the body of the patient. Persons of 
ordinary skill can easily determine optimum dosages, dosing methodologies and repetition 
rates. Optimum dosages may vary depending on the relative potency of individual 
compounds, and can generally be estimated based on EC50s found to be effective in in vitro 
and in vivo animal models. In some embodiments, the animal models include transgenic 
animals that express a human gene, e.g., a gene that produces a target RNA. The transgenic 
animal can be deficient for the corresponding endogenous RNA. In another embodiment, the 
composition for testing includes an iRNA agent that is complementary, at least in an internal 
region, to a sequence that is conserved between the target RNA in the animal model and the 
target RNA in a human. 

The inventors have discovered that iRNA agents described herein can be administered 
to mammals, particularly large mammals such as nonhuman primates or humans, in a number 
of ways. 

In one embodiment, the administration of the iRNA agent, e.g., a double-stranded 
iRNA agent, or sRNA agent, composition is parenteral, e.g., intravenous (e.g., as a bolus or as 
a diffusible infusion), intradermal, intraperitoneal, intramuscular, intrathecal, intraventricular, 
intracranial, subcutaneous, transmucosal, buccal, sublingual, endoscopic, rectal, oral, vaginal, 
topical, pulmonary, intranasal, urethral or ocular. Administration can be provided by the 
subject or by another person, e.g., a health care provider. The medication can be provided in 
measured doses or in a dispenser which delivers a metered dose. Selected modes of delivery 
are discussed in more detail below. 

The invention provides methods, compositions, and kits for rectal administration or 
delivery of iRNA agents described herein. 

Accordingly, an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
(e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a 
DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, 
or precursor thereof) described herein, e.g., a therapeutically effective amount of an iRNA 
agent described herein, e.g., an iRNA agent having a double stranded region of less than 40, 
and preferably less than 30 nucleotides and having one or two 1-3 nucleotide single strand 3' 
overhangs can be administered rectally, e.g., introduced through the rectum into the lower or 
upper colon. This approach is particularly useful in the treatment of inflammatory disorders, 
disorders characterized by unwanted cell proliferation, e.g., polyps, or colon cancer. 

The medication can be delivered to a site in the colon by introducing a dispensing 
device, e.g., a flexible, camera-guided device similar to that used for inspection of the colon 
or removal of polyps, which includes means for delivery of the medication. 

The rectal administration of the iRNA agent can be by means of an enema. The iRNA 
agent of the enema can be dissolved in a saline or buffered solution. The rectal 
administration can also be by means of a suppository, which can include other ingredients, e.g., 
an excipient, e.g., cocoa butter or hydroxypropylmethylcellulose. 

Any of the iRNA agents described herein can be administered orally, e.g., in the form 
of tablets, capsules, gel capsules, lozenges, troches or liquid syrups. Further, the composition 
can be applied topically to a surface of the oral cavity. 

Any of the iRNA agents described herein can be administered buccally. For example, 
the medication can be sprayed into the buccal cavity or applied directly, e.g., in a liquid, 
solid, or gel form to a surface in the buccal cavity. This administration is particularly 
desirable for the treatment of inflammations of the buccal cavity, e.g., the gums or tongue. 
In one embodiment, the buccal administration is by spraying into the cavity, e.g., 
without inhalation, from a dispenser, e.g., a metered dose spray dispenser that dispenses the 
pharmaceutical composition and a propellant. 

Any of the iRNA agents described herein can be administered to ocular tissue. For 
example, the medications can be applied to the surface of the eye or nearby tissue, e.g., the 
inside of the eyelid. They can be applied topically, e.g., by spraying, in drops, as an 
eyewash, or an ointment. Administration can be provided by the subject or by another 
person, e.g., a health care provider. The medication can be provided in measured doses or in 
a dispenser which delivers a metered dose. The medication can also be administered to the 
interior of the eye, and can be introduced by a needle or other delivery device which can 
introduce it to a selected area or structure. Ocular treatment is particularly desirable for 
treating inflammation of the eye or nearby tissue. 

Any of the iRNA agents described herein can be administered directly to the skin. 
For example, the medication can be applied topically or delivered in a layer of the skin, e.g., 
by the use of a microneedle or a battery of microneedles which penetrate into the skin, but 
preferably not into the underlying muscle tissue. Administration of the iRNA agent 
composition can be topical. Topical applications can, for example, deliver the composition 
to the dermis or epidermis of a subject. Topical administration can be in the form of 
transdermal patches, ointments, lotions, creams, gels, drops, suppositories, sprays, liquids or 
powders. A composition for topical administration can be formulated as a liposome, micelle, 
emulsion, or other lipophilic molecular assembly. The transdermal administration can be 
applied with at least one penetration enhancer, such as iontophoresis, phonophoresis, and 
sonophoresis. 

Any of the iRNA agents described herein can be administered to the pulmonary 
system. Pulmonary administration can be achieved by inhalation or by the introduction of a 
delivery device into the pulmonary system, e.g., by introducing a delivery device which can 
dispense the medication. A preferred method of pulmonary delivery is by inhalation. The 
medication can be provided in a dispenser which delivers the medication, e.g., wet or dry, in 
a form sufficiently small such that it can be inhaled. The device can deliver a metered dose 
of medication. The subject, or another person, can administer the medication. 

Pulmonary delivery is effective not only for disorders which directly affect 
pulmonary tissue, but also for disorders which affect other tissue. 

iRNA agents can be formulated as a liquid or nonliquid, e.g., a powder, crystal, or 
aerosol for pulmonary delivery. 

Any of the iRNA agents described herein can be administered nasally. Nasal 
administration can be achieved by introduction of a delivery device into the nose, e.g., by 
introducing a delivery device which can dispense the medication. Methods of nasal delivery 
include spray, aerosol, liquid, e.g., by drops, or by topical administration to a surface of the 
nasal cavity. The medication can be provided in a dispenser with delivery of the medication, 
e.g., wet or dry, in a form sufficiently small such that it can be inhaled. The device can 
deliver a metered dose of medication. The subject, or another person, can administer the 
medication. 

Nasal delivery is effective not only for disorders which directly affect nasal tissue, but 
also for disorders which affect other tissue. 

iRNA agents can be formulated as a liquid or nonliquid, e.g., a powder, crystal, or 
aerosol for nasal delivery. 

An iRNA agent can be packaged in a viral natural capsid or in a chemically or 
enzymatically produced artificial capsid or structure derived therefrom. 

The dosage of a pharmaceutical composition including an iRNA agent can be 
administered in order to alleviate the symptoms of a disease state, e.g., cancer or a 
cardiovascular disease. A subject can be treated with the pharmaceutical composition by any 
of the methods mentioned above. 

Gene expression in a subject can be modulated by administering a pharmaceutical 
composition including an iRNA agent. 

A subject can be treated by administering a defined amount of an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent 
which can be processed into a sRNA agent) composition that is in a powdered form, e.g., a 
collection of microparticles, such as crystalline particles. The composition can include a 
plurality of iRNA agents, e.g., specific for one or more different endogenous target RNAs. 
The method can include other features described herein. 

A subject can be treated by administering a defined amount of an iRNA agent 
composition that is prepared by a method that includes spray-drying, i.e., atomizing a liquid 
solution, emulsion, or suspension, immediately exposing the droplets to a drying gas, and 
collecting the resulting porous powder particles. The composition can include a plurality of 
iRNA agents, e.g., specific for one or more different endogenous target RNAs. The method 
can include other features described herein. 

The iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a 
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA 
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof), can be provided in a powdered, crystallized or other finely divided form, 
with or without a carrier, e.g., a micro- or nano-particle suitable for inhalation or other 
pulmonary delivery. This can include providing an aerosol preparation, e.g., an aerosolized 
spray-dried composition. The aerosol composition can be provided in and/or dispensed by a 
metered dose delivery device. 

The subject can be treated for a condition treatable by inhalation, e.g., by aerosolizing 
a spray-dried iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a 
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA 
which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or 
precursor thereof) composition and inhaling the aerosolized composition. The iRNA agent 
can be an sRNA. The composition can include a plurality of iRNA agents, e.g., specific for 
one or more different endogenous target RNAs. The method can include other features 
described herein. 

A subject can be treated by, for example, administering a composition including an 
effective/defined amount of an iRNA agent, e.g., a double-stranded iRNA agent, or sRNA 
agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed into a sRNA 
agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, or 
sRNA agent, or precursor thereof), wherein the composition is prepared by a method that 
includes spray-drying, lyophilization, vacuum drying, evaporation, fluid bed drying, or a 
combination of these techniques. 

In another aspect, the invention features a method that includes: evaluating a 
parameter related to the abundance of a transcript in a cell of a subject; comparing the 
evaluated parameter to a reference value; and if the evaluated parameter has a preselected 
relationship to the reference value (e.g., it is greater), administering an iRNA agent (or a 
precursor, e.g., a larger iRNA agent which can be processed into a sRNA agent, or a DNA 
which encodes an iRNA agent or precursor thereof) to the subject. In one embodiment, the 
iRNA agent includes a sequence that is complementary to the evaluated transcript. For 
example, the parameter can be a direct measure of transcript levels, a measure of a protein 
level, or a disease or disorder symptom or characterization (e.g., rate of cell proliferation and/or 
tumor mass, or viral load). 

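
The evaluate, compare, and administer steps of this method amount to a single conditional. A minimal sketch follows, assuming the "preselected relationship" can be expressed as a two-argument comparison; the function name `should_administer` and the default of "greater than" (taken from the example in the text) are illustrative choices only.

```python
import operator
from typing import Callable


def should_administer(evaluated: float,
                      reference: float,
                      relation: Callable[[float, float], bool] = operator.gt) -> bool:
    """Return True when the evaluated parameter has the preselected
    relationship to the reference value.

    `relation` defaults to 'greater than', mirroring the example in the
    text; any other comparison (e.g. operator.ge) is supplied by the caller.
    """
    return relation(evaluated, reference)


if __name__ == "__main__":
    # Hypothetical transcript-level readings in arbitrary units.
    print(should_administer(evaluated=2.5, reference=1.0))
    print(should_administer(evaluated=0.8, reference=1.0))
```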
In another aspect, the invention features a method that includes: administering a first 
amount of a composition that comprises an iRNA agent, e.g., a double-stranded iRNA agent, 
or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed into a 
sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA agent, 
or sRNA agent, or precursor thereof) to a subject, wherein the iRNA agent includes a strand 
substantially complementary to a target nucleic acid; evaluating an activity associated with a 
protein encoded by the target nucleic acid; wherein the evaluation is used to determine if a 
second amount should be administered. In a preferred embodiment the method includes 
administering a second amount of the composition, wherein the timing of administration or 
dosage of the second amount is a function of the evaluating. The method can include other 
features described herein. 

In another aspect, the invention features a method of administering a source of a 
double-stranded iRNA agent (ds iRNA agent) to a subject. The method includes 
administering or implanting a source of a ds iRNA agent, e.g., a sRNA agent, that (a) 
includes a double-stranded region that is 19-25 nucleotides long, preferably 21-23 
nucleotides, (b) is complementary to a target RNA (e.g., an endogenous RNA or a pathogen 
RNA), and, optionally, (c) includes at least one 3' overhang 1-5 nt long. In one embodiment, 
the source releases ds iRNA agent over time, e.g., the source is a controlled or a slow release 
source, e.g., a microparticle that gradually releases the ds iRNA agent. In another 
embodiment, the source is a pump, e.g., a pump that includes a sensor or a pump that can 
release one or more unit doses. 
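
The structural criteria (a) and (c) above are numeric ranges and can be checked mechanically. The following sketch assumes a simple record holding the duplex length and any 3' overhang lengths; the class `DuplexSpec` and the function names are hypothetical, and criterion (b), complementarity to a target RNA, is deliberately not modeled.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DuplexSpec:
    """Hypothetical description of a ds iRNA agent's geometry."""
    duplex_length_nt: int                           # length of the double-stranded region
    three_prime_overhangs_nt: Tuple[int, ...] = ()  # lengths of any 3' overhangs


def meets_length_criterion(spec: DuplexSpec) -> bool:
    """Criterion (a): double-stranded region of 19-25 nucleotides."""
    return 19 <= spec.duplex_length_nt <= 25


def is_preferred_length(spec: DuplexSpec) -> bool:
    """Preferred range within criterion (a): 21-23 nucleotides."""
    return 21 <= spec.duplex_length_nt <= 23


def has_valid_overhang(spec: DuplexSpec) -> bool:
    """Optional criterion (c): at least one 3' overhang of 1-5 nt."""
    return any(1 <= n <= 5 for n in spec.three_prime_overhangs_nt)


if __name__ == "__main__":
    spec = DuplexSpec(duplex_length_nt=21, three_prime_overhangs_nt=(2, 2))
    print(meets_length_criterion(spec), is_preferred_length(spec), has_valid_overhang(spec))
```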

In one aspect, the invention features a pharmaceutical composition that includes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) 
including a nucleotide sequence complementary to a target RNA, e.g., substantially and/or 
exactly complementary. The target RNA can be a transcript of an endogenous human gene. 
In one embodiment, the iRNA agent (a) is 19-25 nucleotides long, preferably 21-23 
nucleotides, (b) is complementary to an endogenous target RNA, and, optionally, (c) includes 
at least one 3' overhang 1-5 nt long. In one embodiment, the pharmaceutical composition can 
be an emulsion, microemulsion, cream, jelly, or liposome. 

In one example the pharmaceutical composition includes an iRNA agent mixed with a 
topical delivery agent. The topical delivery agent can be a plurality of microscopic vesicles. 
The microscopic vesicles can be liposomes. In a preferred embodiment the liposomes are 
cationic liposomes. 

In another aspect, the pharmaceutical composition includes an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent (e.g., a precursor, e.g., a larger iRNA agent 
which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a 
double-stranded iRNA agent, or sRNA agent, or precursor thereof) admixed with a topical 
penetration enhancer. In one embodiment, the topical penetration enhancer is a fatty acid. 
The fatty acid can be arachidonic acid, oleic acid, lauric acid, caprylic acid, capric acid, 
myristic acid, palmitic acid, stearic acid, linoleic acid, linolenic acid, dicaprate, tricaprate, 
monolein, dilaurin, glyceryl 1-monocaprate, 1-dodecylazacycloheptan-2-one, an 
acylcarnitine, an acylcholine, or a C1-10 alkyl ester, monoglyceride, diglyceride or 
pharmaceutically acceptable salt thereof. 

In another embodiment, the topical penetration enhancer is a bile salt. The bile salt 
can be cholic acid, dehydrocholic acid, deoxycholic acid, glucholic acid, glycholic acid, 
glycodeoxycholic acid, taurocholic acid, taurodeoxycholic acid, chenodeoxycholic acid, 
ursodeoxycholic acid, sodium tauro-24,25-dihydro-fusidate, sodium glycodihydrofusidate, 
polyoxyethylene-9-lauryl ether or a pharmaceutically acceptable salt thereof. 

In another embodiment, the penetration enhancer is a chelating agent. The chelating 
agent can be EDTA, citric acid, a salicylate, an N-acyl derivative of collagen, laureth-9, an 
N-amino acyl derivative of a beta-diketone or a mixture thereof. 

In another embodiment, the penetration enhancer is a surfactant, e.g., an ionic or
nonionic surfactant. The surfactant can be sodium lauryl sulfate, polyoxyethylene-9-lauryl
ether, polyoxyethylene-20-cetyl ether, a perfluorochemical emulsion or mixture thereof.

In another embodiment, the penetration enhancer can be selected from a group
consisting of unsaturated cyclic ureas, 1-alkyl-alkones, 1-alkenylazacyclo-alkanones,
steroidal anti-inflammatory agents and mixtures thereof. In yet another embodiment the
penetration enhancer can be a glycol, a pyrrole, an azone, or a terpene.

In one aspect, the invention features a pharmaceutical composition including an
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a
form suitable for oral delivery. In one embodiment, oral delivery can be used to deliver an
iRNA agent composition to a cell or a region of the gastro-intestinal tract, e.g., small
intestine, colon (e.g., to treat a colon cancer), and so forth. The oral delivery form can be
tablets, capsules or gel capsules. In one embodiment, the iRNA agent of the pharmaceutical
composition modulates expression of a cellular adhesion protein, modulates a rate of cellular
proliferation, or has biological activity against eukaryotic pathogens or retroviruses. In
another embodiment, the pharmaceutical composition includes an enteric material that
substantially prevents dissolution of the tablets, capsules or gel capsules in a mammalian
stomach. In a preferred embodiment the enteric material is a coating. The coating can be
acetate phthalate, propylene glycol, sorbitan monooleate, cellulose acetate trimellitate,
hydroxy propyl methylcellulose phthalate or cellulose acetate phthalate.

In another embodiment, the oral dosage form of the pharmaceutical composition
includes a penetration enhancer. The penetration enhancer can be a bile salt or a fatty acid.
The bile salt can be ursodeoxycholic acid, chenodeoxycholic acid, or a salt thereof. The
fatty acid can be capric acid, lauric acid, or a salt thereof.

In another embodiment, the oral dosage form of the pharmaceutical composition
includes an excipient. In one example the excipient is polyethyleneglycol. In another
example the excipient is precirol.

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes a plasticizer. The plasticizer can be diethyl phthalate, triacetin, dibutyl sebacate,
dibutyl phthalate or triethyl citrate. 

In one aspect, the invention features a pharmaceutical composition including an
iRNA agent and a delivery vehicle. In one embodiment, the iRNA agent (a) is 19-25
nucleotides long, preferably 21-23 nucleotides, (b) is complementary to an endogenous target
RNA, and, optionally, (c) includes at least one 3' overhang 1-5 nucleotides long.
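The length and overhang criteria recited above lend themselves to a simple check. The following is a minimal sketch only; the function name and the way overhang lengths are passed in are illustrative assumptions and are not part of the disclosure. Criterion (b), complementarity to a target RNA, is not evaluated here.

```python
from typing import Dict, Sequence


def check_length_and_overhang(strand_len: int,
                              overhangs_3p: Sequence[int] = ()) -> Dict[str, bool]:
    """Evaluate criteria (a) and (c) for a candidate iRNA agent.

    strand_len   -- strand length in nucleotides
    overhangs_3p -- lengths (nt) of any 3' overhangs on the duplex
    (Hypothetical helper; names are illustrative.)
    """
    return {
        # (a) 19-25 nucleotides long
        "length_ok": 19 <= strand_len <= 25,
        # (a) preferred range, 21-23 nucleotides
        "length_preferred": 21 <= strand_len <= 23,
        # (c) optional: at least one 3' overhang of 1-5 nt
        "has_3p_overhang": any(1 <= n <= 5 for n in overhangs_3p),
    }


result = check_length_and_overhang(21, (2, 2))
```

For a 21-nt strand with two 2-nt 3' overhangs, all three entries evaluate to true; a 27-nt strand with no overhang fails both the length test and the optional overhang test.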

In one embodiment, the delivery vehicle can deliver an iRNA agent, e.g., a double-
stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-
stranded iRNA agent, or sRNA agent, or precursor thereof) to a cell by a topical route of
administration. The delivery vehicle can be microscopic vesicles. In one example the
microscopic vesicles are liposomes. In a preferred embodiment the liposomes are cationic
liposomes. In another example the microscopic vesicles are micelles.

In one aspect, the invention features a pharmaceutical composition including an iRNA
agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger
iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an iRNA
agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in an
injectable dosage form. In one embodiment, the injectable dosage form of the pharmaceutical
composition includes sterile aqueous solutions or dispersions and sterile powders. In a
preferred embodiment the sterile solution can include a diluent such as water, saline solution,
fixed oils, polyethylene glycols, glycerin, or propylene glycol.

In one aspect, the invention features a pharmaceutical composition including an
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in an
oral dosage form. In one embodiment, the oral dosage form is selected from the group
consisting of tablets, capsules and gel capsules. In another embodiment, the pharmaceutical
composition includes an enteric material that substantially prevents dissolution of the tablets,
capsules or gel capsules in a mammalian stomach. In a preferred embodiment the enteric
material is a coating. The coating can be acetate phthalate, propylene glycol, sorbitan
monooleate, cellulose acetate trimellitate, hydroxy propyl methyl cellulose phthalate or
cellulose acetate phthalate. In one embodiment, the oral dosage form of the pharmaceutical
composition includes a penetration enhancer, e.g., a penetration enhancer described herein.

In another embodiment, the oral dosage form of the pharmaceutical composition
includes an excipient. In one example the excipient is polyethyleneglycol. In another
example the excipient is precirol.

In another embodiment, the oral dosage form of the pharmaceutical composition 
includes a plasticizer. The plasticizer can be diethyl phthalate, triacetin, dibutyl sebacate,
dibutyl phthalate or triethyl citrate. 

In one aspect, the invention features a pharmaceutical composition including an
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a
rectal dosage form. In one embodiment, the rectal dosage form is an enema. In another
embodiment, the rectal dosage form is a suppository.

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
vaginal dosage form. In one embodiment, the vaginal dosage form is a suppository. In 
another embodiment, the vaginal dosage form is a foam, cream, or gel. 

In one aspect, the invention features a pharmaceutical composition including an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a 
larger iRNA agent which can be processed into a sRNA agent, or a DNA which encodes an 
iRNA agent, e.g., a double-stranded iRNA agent, or sRNA agent, or precursor thereof) in a 
pulmonary or nasal dosage form. In one embodiment, the iRNA agent is incorporated into a
particle, e.g., a macroparticle, e.g., a microsphere. The particle can be produced by spray 
drying, lyophilization, evaporation, fluid bed drying, vacuum drying, or a combination 
thereof. The microsphere can be formulated as a suspension, a powder, or an implantable 
solid. 

In one aspect, the invention features a spray-dried iRNA agent, e.g., a double- 
stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can
be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-
stranded iRNA agent, or sRNA agent, or precursor thereof) composition suitable for
inhalation by a subject, including: (a) a therapeutically effective amount of an iRNA agent
suitable for treating a condition in the subject by inhalation; (b) a pharmaceutically
acceptable excipient selected from the group consisting of carbohydrates and amino acids; 
and (c) optionally, a dispersibility-enhancing amount of a physiologically-acceptable, water- 
soluble polypeptide. 

In one embodiment, the excipient is a carbohydrate. The carbohydrate can be 
selected from the group consisting of monosaccharides, disaccharides, trisaccharides, and 
polysaccharides. In a preferred embodiment the carbohydrate is a monosaccharide selected
from the group consisting of dextrose, galactose, mannitol, D-mannose, sorbitol, and sorbose.
In another preferred embodiment the carbohydrate is a disaccharide selected from the group
consisting of lactose, maltose, sucrose, and trehalose.

In another embodiment, the excipient is an amino acid. In one embodiment, the
amino acid is a hydrophobic amino acid. In a preferred embodiment the hydrophobic amino
acid is selected from the group consisting of alanine, isoleucine, leucine, methionine,
phenylalanine, proline, tryptophan, and valine. In yet another embodiment the amino acid is a
polar amino acid. In a preferred embodiment the amino acid is selected from the group
consisting of arginine, histidine, lysine, cysteine, glycine, glutamine, serine, threonine,
tyrosine, aspartic acid and glutamic acid.

In one embodiment, the dispersibility-enhancing polypeptide is selected from the 
group consisting of human serum albumin, α-lactalbumin, trypsinogen, and polyalanine.

In one embodiment, the spray-dried iRNA agent composition includes particles
having a mass median diameter (MMD) of less than 10 microns. In another embodiment,
the spray-dried iRNA agent composition includes particles having a mass median diameter of
less than 5 microns. In yet another embodiment the spray-dried iRNA agent composition
includes particles having a mass median aerodynamic diameter (MMAD) of less than 5
microns.
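By way of illustration only, a mass median diameter can be estimated from a list of measured particle diameters. The sketch below assumes spherical particles of equal density, so that each particle's mass is proportional to the cube of its diameter; the function name and sample values are hypothetical and are not measurements from this disclosure.

```python
def mass_median_diameter(diameters_um):
    """Return the diameter at which cumulative particle mass first reaches 50%.

    Assumes spheres of equal density, so mass is proportional to d**3.
    """
    ordered = sorted(diameters_um)
    masses = [d ** 3 for d in ordered]          # relative mass of each particle
    half_total = sum(masses) / 2
    running = 0.0
    for diameter, mass in zip(ordered, masses):
        running += mass
        if running >= half_total:
            return diameter
    raise ValueError("no particles supplied")


# Hypothetical sample, in microns
sample = [1, 2, 3, 4]
mmd = mass_median_diameter(sample)
meets_10_micron_limit = mmd < 10
meets_5_micron_limit = mmd < 5
```

Because mass scales with the cube of the diameter, the largest particles dominate: for the sample above, the mass median diameter is 4 microns even though half of the particles by count are 2 microns or smaller.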

In certain other aspects, the invention provides kits that include a suitable container
containing a pharmaceutical formulation of an iRNA agent, e.g., a double-stranded iRNA
agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent which can be processed
into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a double-stranded iRNA
agent, or sRNA agent, or precursor thereof). In certain embodiments the individual
components of the pharmaceutical formulation may be provided in one container.
Alternatively, it may be desirable to provide the components of the pharmaceutical
formulation separately in two or more containers, e.g., one container for an iRNA agent
preparation, and at least another for a carrier compound. The kit may be packaged in a
number of different configurations such as one or more containers in a single box. The
different components can be combined, e.g., according to instructions provided with the kit.
The components can be combined according to a method described herein, e.g., to prepare
and administer a pharmaceutical composition. The kit can also include a delivery device.

In another aspect, the invention features a device, e.g., an implantable device, wherein
the device can dispense or administer a composition that includes an iRNA agent, e.g., a
double-stranded iRNA agent, or sRNA agent, (e.g., a precursor, e.g., a larger iRNA agent
which can be processed into a sRNA agent, or a DNA which encodes an iRNA agent, e.g., a
double-stranded iRNA agent, or sRNA agent, or precursor thereof), e.g., an iRNA agent that
silences an endogenous transcript. In one embodiment, the device is coated with the
composition. In another embodiment the iRNA agent is disposed within the device. In
another embodiment, the device includes a mechanism to dispense a unit dose of the
composition. In other embodiments the device releases the composition continuously, e.g.,
by diffusion. Exemplary devices include stents, catheters, pumps, artificial organs or organ
components (e.g., artificial heart, a heart valve, etc.), and sutures.

As used herein, the term "crystalline" describes a solid having the structure or
characteristics of a crystal, i.e., particles of three-dimensional structure in which the plane
faces intersect at definite angles and in which there is a regular internal structure. The
compositions of the invention may have different crystalline forms. Crystalline forms can be
prepared by a variety of methods, including, for example, spray drying.

The invention is further illustrated by the following examples, which should not be 
construed as further limiting. 



EXAMPLES

Example 1: Inhibition of endogenous ApoM gene expression in mice 

Apolipoprotein M (ApoM) is a human apolipoprotein predominantly present in
high-density lipoprotein (HDL) in plasma. ApoM is reported to be expressed exclusively in
liver and in kidney (Xu N et al., J Biol Chem 1999 Oct 29;274(44):31286-90). Mouse
ApoM is a 21 kD membrane associated protein, and, in serum, the protein is associated with
HDL particles. ApoM gene expression is regulated by the transcription factor hepatocyte
nuclear factor 1 alpha (Hnf-1α), as Hnf-1α -/- mice are ApoM deficient. In humans, mutations
in the HNF-1 alpha gene represent a common cause of maturity-onset diabetes of the young
(MODY).

A variety of test iRNAs were synthesized to target the mouse ApoM gene. This gene
was chosen in part because of its high expression levels and exclusive activity in the liver and
kidney.

Three different classes of dsRNA agents were synthesized, each class having different
modifications and features at the 5' and 3' ends, see Table 4.



WO 2004/080406                                                        PCT/US2004/007070



Table 4

Targeted ORF's

5   The 23mer: AAGTTTGGGCAGCTCTGCTCT (SEQ ID NO: 6708)
19  The 23mer: AAGTGGACATACCGATTGACT (SEQ ID NO: 6709)
25  The 23mer: AACTCAGAACTGAAGGGCGCC (SEQ ID NO: 6710)
27  The 23mer: AAGGGCGCCCAGACATGAAAA (SEQ ID NO: 6711)

3'-UTR (beginning at 645)

42: AAGATAGGAGCCCAGCTTCGA (SEQ ID NO: 6712)

Class I

21-nt iRNAs; t, deoxythymidine; p, phosphate

pGUUUGGGCAGCUCUGCUCUtt (SEQ ID NO: 6712)   #1
pAGAGCAGAGCUGCCCAAACtt (SEQ ID NO: 6713)

pGUGGACAUACCGAUUGACUtt (SEQ ID NO: 6714)   #2
pAGUCAAUCGGUAUGUCCACtt (SEQ ID NO: 6715)

pCUCAGAACUGAAGGGCGCCtt (SEQ ID NO: 6716)   #3
pGGCGCCCUUCAGUUCUGAGtt (SEQ ID NO: 6717)

pGAUAGGAGCCCAGCUUCGAtt (SEQ ID NO: 6718)   #4
pUCGAAGCUGGGCUCCUAUCtt (SEQ ID NO: 6719)

Class II

21-nt iRNAs; t, deoxythymidine; p, phosphate; ps, thiophosphate

pGUUUGGGCAGCUCUGCUCpsUpstpst (SEQ ID NO: 6720)   #11
pAGAGCAGAGCUGCCCAAApsCpstpst (SEQ ID NO: 6721)

pGUGGACAUACCGAUUGACpsUpstpst (SEQ ID NO: 6722)   #13
pAGUCAAUCGGUAUGUCCApsCpstpst (SEQ ID NO: 6723)

pCUCAGAACUGAAGGGCGCpsCpstpst (SEQ ID NO: 6724)   #15
pGGCGCCCUUCAGUUCUGApsGpstpst (SEQ ID NO: 6725)

pGAUAGGAGCCCAGCUUCGpsApstpst (SEQ ID NO: 6726)   #17
pUCGAAGCUGGGCUCCUAUpsCpstpst (SEQ ID NO: 6727)

Class III

23-nt antisense, 21-nt sense, blunt-ended 5'-as

GUUUGGGCAGCUCUGCUCUCU   (SEQ ID NO: 6728)   #19
AGAGAGCAGAGCUGCCCAAACUU (SEQ ID NO: 6729)

GUGGACAUACCGAUUGACUGA   (SEQ ID NO: 6730)   #21
UCAGUCAAUCGGUAUGUCCACUU (SEQ ID NO: 6731)

CUCAGAACUGAAGGGCGCCCA   (SEQ ID NO: 6732)   #23
UGGGCGCCCUUCAGUUCUGAGUU (SEQ ID NO: 6733)

GAUAGGAGCCCAGCUUCGAGU   (SEQ ID NO: 6734)   #25
ACUCGAAGCUGGGCUCCUAUCUU (SEQ ID NO: 6735)


Class I dsRNAs consisted of 21 nucleotide paired sense and antisense strands. The
sense and antisense strands were each phosphorylated at their 5' ends. The double stranded
region was 19 nucleotides long and consisted of ribonucleotides. The 3' end of each strand
created a two nucleotide overhang consisting of two deoxyribonucleotide thymidines. See
constructs #1-4 in Table 4.

Class II dsRNAs were also 21 nucleotides long, with a 19 nucleotide double strand 
region. The sense and antisense strands were each phosphorylated at their 5' ends. The three
3' terminal nucleotides of the sense and antisense strands were phosphorothioate
deoxyribonucleotides, and the two terminal phosphorothioate thymidines were unpaired,
creating a 3' overhang region at each end of the iRNA molecule. See constructs 11, 13, 15,
and 17 in Table 4. 

Class III dsRNAs included a 23 ribonucleotide antisense strand and a 
21 ribonucleotide sense strand, to form a construct having a blunt 5' end and a 3' overhang region.
See constructs 19, 21, 23, and 25 in Table 4. 
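The Class III geometry can be checked directly from the strand sequences. The sketch below uses the construct 19 sequences as read from Table 4 (with apparent scanning errors in the published text corrected); the helper names are hypothetical. It tests whether the first 21 nucleotides of the antisense strand pair with the entire sense strand, which would leave the antisense 5' end blunt and the remaining antisense nucleotides as a 3' overhang.

```python
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def class_iii_geometry(sense, antisense):
    """Return (blunt_at_antisense_5p_end, antisense_3p_overhang)."""
    paired = reverse_complement(sense)
    blunt = antisense.startswith(paired)
    overhang = antisense[len(paired):] if blunt else None
    return blunt, overhang


# Construct 19: 21-nt sense strand, 23-nt antisense strand
blunt, overhang = class_iii_geometry(
    "GUUUGGGCAGCUCUGCUCUCU",
    "AGAGAGCAGAGCUGCCCAAACUU",
)
```

For construct 19 the check reports a blunt antisense 5' end and a two-nucleotide "UU" overhang at the antisense 3' end, consistent with the description above.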

Within each of the three classes of iRNAs, the four dsRNA molecules were designed 
to target four different regions of the ApoM transcript. dsRNAs 1, 11, and 19 targeted the 5'
end of the open reading frame (ORF). dsRNAs 2, 13, and 21, and 3, 15, and 23, targeted two 
internal regions (one 5' proximal and one 3' proximal) of the ORF, and the 4, 17, and 25 
iRNA constructs targeted to a region of the 3' untranslated sequence (3' UTS) of the ApoM 
mRNA. This is summarized in Table 5. 



Table 5. iRNA molecules targeted to mouse ApoM

                                             Class I    Class II    Class III
iRNA targeted to 5' end of ORF                  1          11          19
iRNA targeted to middle ORF (5' proximal)       2          13          21
iRNA targeted to middle ORF (3' proximal)       3          15          23
iRNA targeted to 3' UTS                         4          17          25



CD1 mice (6-8 weeks old, ~35 g) were administered one of the test iRNAs in PBS
solution. Two hundred micrograms of iRNA in a volume of solution equal to 10% body
weight (~5.7 mg iRNA/kg mouse) was administered by the method of high pressure tail vein
injection, over a 10-20 sec. time interval. After a 24 h recovery period, a second injection
was performed using the same dose and mode of administration as the first injection, and
following another 24 h, a third and final injection was administered, also using the same dose
and mode of administration. After a final 24 h recovery, the mouse was sacrificed, serum was
collected and the liver and kidney harvested to assay for an effect on ApoM gene expression.
Expression was monitored by quantitative RT-PCR and Western blot analyses. This
experiment was repeated for each of the iRNAs listed in Table 4.

Class I iRNAs did not alter ApoM RNA levels in mice, as indicated by quantitative 
RT-PCR. This is in contrast to the effect of these iRNAs in cultured HepG2 cells. Cells 
cotransfected with a plasmid expressing exogenous ApoM RNA under a CMV promoter and 
a class I iRNA demonstrated a 25% or greater reduction in ApoM RNA concentrations as 
compared to control transfections. The iRNA molecules 1, 2 and 3 each caused a 75% 
decrease in exogenous ApoM mRNA levels. 

Class II iRNAs reduced liver and kidney ApoM mRNA levels by ~30-85%. The iRNA
molecule "13" elicited the most dramatic reduction in mRNA levels; quantitative RT-PCR
indicated a decrease of about 85% in liver tissue. Serum ApoM protein levels were also
reduced as was evidenced by Western blot analysis. The iRNAs 11, 13 and 15 reduced
protein levels by about 50%, while iRNA 17 had the mildest effect, reducing levels only by
~15-20%.

Class III iRNAs (constructs 19, 21, and 23) reduced serum ApoM levels by ~40-50%.

To determine the effect of dosage on iRNA mediated ApoM inhibition, the
experiment described above was repeated with three injections of 50 µg iRNA "11"
(~1.4 mg iRNA/kg mouse). This lower dosage of iRNA resulted in a reduction of serum
ApoM levels of about 50%. This is compared with the reduction seen with the 200 µg
injections, which reduced serum levels by 25-45%. These results indicated the lower
dosage amounts of iRNAs were effective.
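The per-kilogram doses quoted for the 200 µg and 50 µg injections follow from the stated mouse weight of about 35 g. A minimal sketch of the arithmetic is given below; the function names are illustrative, and the injection-volume estimate additionally assumes a solution density of about 1 g/mL, which is not stated in the text.

```python
def dose_mg_per_kg(dose_ug, body_weight_g):
    """Convert a per-animal dose to mg/kg (µg per g equals mg per kg)."""
    return dose_ug / body_weight_g


def injection_volume_ml(body_weight_g, fraction_of_body_weight=0.10,
                        density_g_per_ml=1.0):
    """Volume equal to a fraction of body weight; density is an assumption."""
    return body_weight_g * fraction_of_body_weight / density_g_per_ml


MOUSE_WEIGHT_G = 35  # approximate weight stated for the CD1 mice

high_dose = round(dose_mg_per_kg(200, MOUSE_WEIGHT_G), 1)   # 200 µg injections
low_dose = round(dose_mg_per_kg(50, MOUSE_WEIGHT_G), 1)     # 50 µg injections
volume = injection_volume_ml(MOUSE_WEIGHT_G)
```

The results, about 5.7 mg/kg and 1.4 mg/kg, agree with the values given in the text; under the density assumption, the injected volume would be about 3.5 mL per mouse.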

In an effort to increase iRNA uptake by cells, iRNAs were precomplexed with
lipofectamine prior to tail vein injections. ApoM protein levels were about 50% of wildtype
levels in mice injected with iRNA "11" when the molecules were preincubated with
lipofectamine; ApoM levels were also about 50% of wildtype when mice were injected with
iRNA "11" that was not precomplexed with lipofectamine.

These experiments revealed that modified iRNAs can greatly influence RNAi-
mediated gene silencing. As demonstrated herein, modifications including phosphorothioate
nucleotides are particularly effective at decreasing target protein levels.



Example 2: apoB protein as a therapeutic target for lipid-based diseases

Apolipoprotein B (apoB) is a candidate target gene for the development of novel
therapies for lipid-based diseases.

Methods described herein can be used to evaluate the efficacy of a particular siRNA
as a therapeutic tool for treating lipid metabolism disorders resulting from elevated apoB
levels. Use of siRNA duplexes to selectively bind and inactivate the target apoB mRNA is an
approach to treat these disorders.
Two approaches: 

i) Inhibition of apoB in ex-vivo models by transfecting siRNA duplexes homologous
to human apoB mRNA in a human hepatoma cell line (Hep G2) and monitoring the level of the
protein and the RNA using the Western blotting and RT-PCR methods, respectively. siRNA
molecules that efficiently inhibit apoB expression will be tested for similar effects in vivo.

ii) In vivo trials using an apoB transgenic mouse model (apoB100 Transgenic Mice,
C57BL/6NTac-TgN (APOB100), Order Model #'s: 1004-T (hemizygotes), B6 (control)).

siRNA duplexes are designed to target apoB-100 or CETP/apoB double transgenic mice
which express both cholesteryl ester transfer protein (CETP) and apoB. The effect of the
siRNA on gene expression in vivo can be measured by monitoring the HDL/LDL cholesterol
level in serum. The results of these experiments would indicate the therapeutic potential of
siRNAs to treat lipid-based diseases, including hypercholesterolemia, HDL/LDL cholesterol
imbalance, familial combined hyperlipidemia, and acquired hyperlipidemia.



Background
Fats, in the form of triglycerides, are ideal for energy storage because they are
highly reduced and anhydrous. An adipocyte (or fat cell) consists of a nucleus, a cell
membrane, and triglycerides, and its function is to store triglycerides.

The lipid portion of the human diet consists largely of triglycerides and cholesterol
(and its esters). These must be emulsified and digested to be absorbed. Specifically, fats
(triacylglycerols) are ingested. Bile (bile acids, salts, and cholesterol), which is made in the
liver, is secreted by the gall bladder. Pancreatic lipase digests the triglycerides to fatty acids,
and also digests di- and mono-acylglycerols, which are absorbed by intestinal epithelial cells
and then are resynthesized into triacylglycerols once inside the cells. These triglycerides and
some cholesterols are combined with apolipoproteins to produce chylomicrons.
Chylomicrons consist of approximately 95% triglycerides. The chylomicrons transport fatty
acids to peripheral tissues. Any excess fat is stored in adipose tissue.

Lipid transport and clearance from the blood into cells, and from the cells into the 
blood and the liver, is mediated by the lipoprotein transport proteins. This class of 
approximately 17 proteins can be divided into three groups: apolipoproteins, lipoprotein
processing proteins, and lipoprotein receptors. 

Apolipoproteins coat lipoprotein particles, and include the A-I, A-II, A-IV, B, C-I,
C-II, C-III, D, E, Apo(a) proteins. Lipoprotein processing proteins include lipoprotein lipase,
hepatic lipase, lecithin cholesterol acyltransferase and cholesterol ester transfer protein.
Lipoprotein receptors include the low density lipoprotein (LDL) receptor, chylomicron-
remnant receptor (the LDL receptor like protein or LDL receptor related protein - LRP) and 
the scavenger receptor. 

Lipoprotein Metabolism
Since the triglycerides, cholesterol esters, and cholesterol absorbed
into the small intestine are not soluble in aqueous medium, they must be combined with 
suitable proteins (apolipoproteins) in order to prevent them from forming large oil droplets. 
The resulting lipoproteins undergo a type of metabolism as they pass through the 
bloodstream and certain organs (notably the liver). 

Also synthesized in the liver is high density lipoprotein (HDL), which contains the
apoproteins A-1, A-2, C-1, and D; HDL collects cholesterol from peripheral tissues and
blood vessels and returns it to the liver. LDL is taken up by specific cell surface receptors
into an endosome, which fuses with a lysosome where cholesterol ester is converted to free
cholesterol. The apoproteins (including apo B-100) are digested to amino acids. The
receptor protein is recycled to the cell membrane. 

The free cholesterol formed by this process has two fates. First, it can move to the
endoplasmic reticulum (ER), where it can inhibit HMG-CoA reductase, the synthesis of
HMG-CoA reductase, and the synthesis of cell surface receptors for LDL. Also in the ER,
cholesterol can speed up the degradation of HMG-CoA reductase. Second, the free cholesterol
can be converted by acyl-CoA:cholesterol acyltransferase (ACAT) to cholesterol esters, which
form oil droplets.

ApoB is the major apolipoprotein of chylomicrons, very low density lipoproteins
(VLDL, which carry most of the plasma triglyceride) and low density lipoprotein (LDL, 
which carry most of the plasma cholesterol). ApoB exists in human plasma in two isoforms, 
apoB-48 and apoB-100. 

ApoB-100 is the major physiological ligand for the LDL receptor. The ApoB 
precursor has 4563 amino acids, and the mature apoB-100 has 4536 amino acid residues. The 
LDL-binding domain of ApoB-100 is proposed to be located between residues 3129 and 
3532. ApoB-100 is synthesized in the liver and is required for the assembly of very low 
density lipoproteins VLDL and for the preparation of apoB-100 to transport triglycerides 
(TG) and cholesterol from the liver to other tissues. ApoB-100 does not interchange between 
lipoprotein particles, as do the other lipoproteins, and it is found in IDL and LDL particles. 
After the removal of apolipoproteins A, E and C, apoB is incorporated into VLDL by
hepatocytes. ApoB-48 is present in chylomicrons and plays an essential role in the intestinal 
absorption of dietary fats. ApoB-48 is synthesized in the small intestine. It comprises the N- 
terminal 48% of apoB-100 and is produced by a posttranscriptional apoB-100 mRNA editing 
event at codon 2153 (C to U). This editing event is a product of the apoBEC-lb enzyme, 
which is expressed in the intestine. This editing event creates a stop codon instead of a 
glutamine codon, and therefore apoB-48, instead of apoB-100 is expressed in the intestine 
(apoB-100 is expressed in the liver). 
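The "48%" figure can be checked against the residue counts given above. The following sketch assumes that the codon number refers to the mature apoB-100 numbering and that the stop codon created at codon 2153 terminates translation after residue 2152; both are assumptions made for the arithmetic, and the variable names are illustrative.

```python
MATURE_APOB100_LENGTH = 4536   # residues in mature apoB-100, per the text
EDITED_CODON = 2153            # glutamine codon converted to a stop codon

# Translation stops at the edited codon, so the product ends one residue earlier
apob48_length = EDITED_CODON - 1

# Fraction of apoB-100 represented by the truncated product
fraction = apob48_length / MATURE_APOB100_LENGTH
percent = round(100 * fraction, 1)
```

Under these assumptions the truncated protein has 2152 residues, or roughly 47% of mature apoB-100, which is in line with the nominal "48" in the name apoB-48.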

There is also strong evidence that plasma apoB levels may be a better index of the
risk of coronary artery disease (CAD) than total or LDL cholesterol levels. Clinical studies
have demonstrated the value of measuring apoB in hypertriglyceridemic,
hypercholesterolemic and normolipidemic subjects.

Table 6. Reference Range Lipid level in the Blood

Lipid                                      Range (mmol/L)
Plasma Cholesterol                         3.5-6.5
Low density lipoprotein                    1.55-4.4
Very low density lipoprotein               0.128-0.645
High density lipoprotein/triglycerides     0.5-2.1
Total lipid                                4.0-10 g/L
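For illustration, a measured value can be compared against the reference ranges of Table 6 as sketched below. The dictionary keys, the function name, and the sample value are hypothetical; only the numeric ranges are taken from the table, and total lipid is omitted because it is reported in g/L rather than mmol/L.

```python
# Reference ranges from Table 6, in mmol/L
REFERENCE_RANGES_MMOL_PER_L = {
    "plasma cholesterol": (3.5, 6.5),
    "low density lipoprotein": (1.55, 4.4),
    "very low density lipoprotein": (0.128, 0.645),
    "high density lipoprotein/triglycerides": (0.5, 2.1),
}


def classify(lipid, value_mmol_per_l):
    """Return 'low', 'within range', or 'high' relative to Table 6."""
    low, high = REFERENCE_RANGES_MMOL_PER_L[lipid]
    if value_mmol_per_l < low:
        return "low"
    if value_mmol_per_l > high:
        return "high"
    return "within range"


# Hypothetical measurement, not data from this disclosure
status = classify("low density lipoprotein", 5.2)
```

A hypothetical LDL value of 5.2 mmol/L would be classified as high, since it exceeds the upper limit of 4.4 mmol/L.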



Molecular genetics of lipid metabolism in both humans and induced mutant mouse models
Elevated plasma levels of LDL and apoB are associated with a higher risk for atherosclerosis
and coronary heart disease, a leading cause of mortality. ApoB is the mandatory constituent
of LDL particles. In addition to its role in lipoprotein metabolism, apoB has also been
implicated as a factor in male infertility and fetal development. Furthermore, two
quantitative trait loci regulating plasma apoB levels have been discovered, through the use of
transgenic mouse models. Future experiments will facilitate the identification of human
orthologous genes encoding regulators of plasma apoB levels. These loci are candidate
therapeutic targets for human disorders characterized by altered plasma apoB levels. Such
disorders include non-apoB linked hypobetalipoproteinemia and familial combined
hyperlipidemia. The identification of these genetic loci would also reveal possible new
pathways involved in the regulation of apoB secretion, potentially providing novel sites for
pharmacological therapy.

Diseases and Clinical Pharmacology
Familial combined hyperlipidemia (FCHL) affects an
estimated one in 10 Americans. FCHL can cause premature heart disease.

Familial Hypercholesterolemia (high level of apo B)
Familial hypercholesterolemia is a common genetic disorder of lipid
metabolism. It is characterized by elevated serum TC in
association with xanthelasma, tendon and tuberous xanthomas, accelerated atherosclerosis,
and early death from myocardial infarction (MI). It is caused by absent or defective LDL
cell receptors, resulting in delayed LDL clearance, an increase in plasma LDL levels, and an
accumulation of LDL cholesterol in macrophages over joints and pressure points, and in
blood vessels.

Atherosclerosis (high level of apo B)
Atherosclerosis develops as a deposition of cholesterol
and fat in the arterial wall due to disturbances in lipid transport and clearance from the blood
into cells and from the cells to blood and the liver.

Clinical studies have demonstrated that elevation of total cholesterol (TC),
low-density lipoprotein cholesterol (LDL-C) and apoB-100 promote human atherosclerosis.
Similarly, decreased levels of high-density lipoprotein cholesterol (HDL-C) are associated
with the development of atherosclerosis.

ApoB may be a factor in the genetic cause of high cholesterol.

The risk of coronary artery disease (CAD) (high level of apo B)
Cardiovascular disease,
including coronary heart disease and stroke, is a leading cause of death and disability. The
major risk factors include age, gender, elevated low-density lipoprotein cholesterol blood
levels, decreased high-density lipoprotein cholesterol levels, cigarette smoking, hypertension,
and diabetes. Emerging risk factors include elevated lipoprotein (a), remnant lipoproteins,
and C reactive protein. Dietary intake, physical activity and genetics also impact
cardiovascular risk. Hypertension and age are the major risk factors for stroke.

Abetalipoproteinemia, an inherited human disease characterized by a near-complete 
absence of apoB-containing lipoproteins in the plasma, is caused by mutations in the gene for 
microsomal triglyceride transfer protein (MTP). 

Model for human atherosclerosis (Lipoprotein A transgenic mouse)
Numerous studies have
demonstrated that an elevated plasma level of lipoprotein(a) (Lp(a)) is a major independent
risk factor for coronary heart disease (CHD). Current therapies, however, have little or no
effect on apo(a) levels and the homology between apo(a) and plasminogen presents barriers
to drug development. Lp(a) particles consist of apo(a) and apoB-100 proteins, and they are
found only in primates and the hedgehog. The development of an LPA transgenic mouse
requires the creation of animals that express both human apoB and apo(a) transgenes to
achieve assembly of Lp(a). An atherosclerosis mouse model would facilitate the study of
the disease process and factors influencing it, and further would facilitate the development of
therapeutic or preventive agents. There are several strategies for gene-oriented therapy. For
example, the missing or non-functional gene can be replaced, or unwanted gene activity can
be inhibited.

Model for lipid Metabolism and Atherosclerosis DNX Transgenic Sciences has
demonstrated that both CETP/ApoB and ApoB transgenic mice develop atherosclerotic
plaques.

Model for apoB-100 overexpression The apoB-100 transgenic mice express high levels of
human apoB-100. They consequently demonstrate elevated serum levels of LDL cholesterol.
After 6 months on a high-fat diet, the mice develop significant foam cell accumulation under
the endothelium and within the media, as well as cholesterol crystals and fibrotic lesions.

Model for Cholesteryl ester transfer protein overexpression The apoB-100 transgenic mice
express the human enzyme, CETP, and consequently demonstrate a dramatically reduced
level of serum HDL cholesterol.

Model for apoB-100 and CETP overexpression The apoB-100 transgenic mice express both
CETP and apoB-100, resulting in mice with a human-like serum HDL/LDL distribution.
Following 6 months on a high-fat diet, these mice develop significant foam cell accumulation
underlying the endothelium and within the media, as well as cholesterol crystals and fibrotic
lesions.

ApoB100 Transgenic Mice (Order Model #'s: 1004-T (hemizygotes), B6 (control))
These mice express high levels of human apoB-100, resulting in mice with elevated serum
levels of LDL cholesterol. These mice are useful in identifying and evaluating compounds to
reduce elevated levels of LDL cholesterol and the risk of atherosclerosis. When fed a high
fat cholesterol diet, these mice develop significant foam cell accumulation underlying the
endothelium and within the media, and have significantly more complex atherosclerotic
lesions than control animals.

Double Transgenic Mice, CETP/ApoB100 (Order Model #: 1007-TT) These mice express
both CETP and apoB-100, resulting in a human-like serum HDL/LDL distribution. These
mice are useful for evaluating compounds to treat hypercholesterolemia or HDL/LDL
cholesterol imbalance to reduce the risk of developing atherosclerosis. When fed a high fat
high cholesterol diet, these mice develop significant foam cell accumulation underlying the
endothelium and within the media, and have significantly more complex atherosclerotic
lesions than control animals.

ApoE gene knockout mouse Homozygous apoE knockout mice exhibit strong
hypercholesterolemia, primarily due to elevated levels of VLDL and IDL caused by a defect
in lipoprotein clearance from plasma. These mice develop atherosclerotic lesions which
progress with age and resemble human lesions (Zhang et al., Science 258:468-471, 1992;
Plump et al., Cell 71:343-353, 1992; Nakashima et al., Arterioscler Thromb. 14:133-140,
1994; Reddick et al., Arterioscler Thromb. 14:141-147, 1994). These mice are a promising
model for studying the effect of diet and drugs on atherosclerosis.

The low density lipoprotein receptor (LDLR) mediates lipoprotein clearance from plasma
through the recognition of apoB and apoE on the surface of lipoprotein particles. Humans
who lack LDL receptors, or have a decreased number of them, have familial
hypercholesterolemia and develop CHD at an early age.

ApoE Knockout Mice (Order Model #: APOE-M) The apoE knockout mouse was created by
gene targeting in embryonic stem cells to disrupt the apoE gene. ApoE, a glycoprotein, is a
structural component of very low density lipoprotein (VLDL) synthesized by the liver and
intestinally synthesized chylomicrons. It is also a constituent of a subclass of high density
lipoproteins (HDLs) involved in cholesterol transport activity among cells. One of the most
important roles of apoE is to mediate high affinity binding of chylomicrons and VLDL
particles that contain apoE to the low density lipoprotein (LDL) receptor. This allows for the
specific uptake of these particles by the liver, which is necessary for transport and prevents
the accumulation in plasma of cholesterol-rich remnants. The homozygous inactivation of the
apoE gene results in animals that are devoid of apoE in their sera. The mice appear to
develop normally, but they exhibit five times the normal serum cholesterol and






spontaneous atherosclerotic lesions. This is similar to a disease in people who have a variant
form of the apoE gene that is defective in binding to the LDL receptor and are at risk for
early development of atherosclerosis and increased plasma triglyceride and cholesterol
levels. There are indications that apoE is also involved in immune system regulation, nerve
regeneration and muscle differentiation. The apoE knockout mice can be used to study the
role of apoE in lipid metabolism, atherogenesis, and nerve injury, and to investigate
intervention therapies that modify the atherogenic process.

Apoe4 Targeted Replacement Mouse (Order Model #: 001549-M) ApoE is a plasma protein
involved in cholesterol transport, and the three human isoforms (E2, E3, and E4) have been
associated with atherosclerosis and Alzheimer's disease. Gene targeting of 129 ES cells was
used to replace the coding sequence of mouse apoE with human APOE4 without disturbing
the murine regulatory sequences. The E4 isoform occurs in approximately 14% of the
human population and is associated with increased plasma cholesterol and a greater risk of
coronary artery disease. The Taconic apoE4 Targeted Replacement model has normal
plasma cholesterol and triglyceride levels, but altered quantities of different plasma
lipoprotein particles. This model also has delayed plasma clearance of cholesterol-rich
lipoprotein particles (VLDL), with only half the clearance rate seen in the apoE3 Targeted
Replacement model. Like the apoE3 model, the apoE4 mice develop altered plasma
lipoprotein values and atherosclerotic plaques on an atherogenic diet. However, the
atherosclerosis is more severe in the apoE4 model, with larger plaques and cholesterol, apoE
and apoB-48 levels twice those seen in the apoE3 model. The Taconic apoE4 Targeted
Replacement model, along with the apoE2 and apoE3 Targeted Replacement Mice, provides
an excellent tool for in vivo study of the human apoE isoforms.

CETP Transgenic Mice (Order Model #: 1003-T) These animals express the human plasma
enzyme, CETP, resulting in mice with a dramatic reduction in serum HDL cholesterol. The
mice can be useful in identifying and evaluating compounds that increase the levels of HDL
cholesterol for reducing the risk of developing atherosclerosis.

Transgene/Promoter: human apolipoprotein A-I These mice produce mouse HDL
cholesterol particles that contain human apolipoprotein A-I. Transgenic expression is
life-long in both sexes (Biochemical Genetics and Metabolism Laboratory, Rockefeller
University, NY City).

A Mouse Model for Abetalipoproteinemia Abetalipoproteinemia, an inherited human disease
characterized by a near-complete absence of apoB-containing lipoproteins in the plasma, is
caused by mutations in the gene for microsomal triglyceride transfer protein (MTP). Gene
targeting was used to knock out the mouse MTP gene (Mttp). In heterozygous knockout
mice (Mttp+/-), the MTP mRNA, protein, and activity levels were reduced by 50% in both
liver and intestine. Recent studies with heterozygous MTP knockout mice have suggested
that half-normal levels of MTP in the liver reduce apoB secretion. The investigators
hypothesized that reduced apoB secretion in the setting of half-normal MTP levels might be
caused by a reduced MTP:apoB ratio in the endoplasmic reticulum, which would reduce the
number of apoB-MTP interactions. If this hypothesis were true, half-normal levels of MTP
might have little impact on lipoprotein secretion in the setting of half-normal levels of apoB
synthesis (since the ratio of MTP to apoB would not be abnormally low) and might cause an
exaggerated reduction in lipoprotein secretion in the setting of apoB overexpression (since
the ratio of MTP to apoB would be even lower). To test this hypothesis, they examined the
effects of heterozygous MTP deficiency on apoB metabolism in the setting of normal levels
of apoB synthesis, half-normal levels of apoB synthesis (heterozygous Apob deficiency), and
increased levels of apoB synthesis (transgenic overexpression of human apoB). Contrary to
their expectations, half-normal levels of MTP reduced plasma apoB-100 levels to the same
extent (~25-35%) at each level of apoB synthesis. In addition, apoB secretion from primary
hepatocytes was reduced to a comparable extent at each level of apoB synthesis. Thus, these
results indicate that the concentration of MTP within the endoplasmic reticulum, rather than
the MTP:apoB ratio, is the critical determinant of lipoprotein secretion. Finally,
heterozygosity for an apoB knockout mutation was found to lower plasma apoB-100 levels
more than heterozygosity for an MTP knockout allele. Consistent with that result, hepatic
triglyceride accumulation was greater in heterozygous apoB knockout mice than in
heterozygous MTP knockout mice. Cre/loxP tissue-specific recombination techniques were
also used to generate liver-specific Mttp knockout mice. Inactivation of the Mttp gene in the
liver caused a striking reduction in very low density lipoprotein (VLDL) triglycerides and
large reductions in both VLDL/low density lipoprotein (LDL) and high density lipoprotein
cholesterol levels. Histologic studies in liver-specific knockout mice revealed moderate
hepatic steatosis. Currently being tested is the hypothesis that accumulation of triglycerides
in the liver renders the liver more susceptible to injury by a second insult (e.g.,
lipopolysaccharide).
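
The two competing explanations above make different predictions, which a small calculation can illustrate. The sketch below is only an illustration: levels are expressed relative to wild type, and the 2.0 value used for apoB overexpression is a hypothetical placeholder, since no fold change is stated in the text. Under the ratio hypothesis the MTP:apoB ratio would differ across the three settings, whereas the observed reduction in plasma apoB-100 was about the same in all three.

```python
# Levels are relative to wild type (1.0 = normal).
# ASSUMPTION: 2.0 for transgenic overexpression is a placeholder value,
# not a measurement taken from the text.
MTP_HETEROZYGOUS = 0.5

APOB_LEVELS = {
    "half-normal apoB synthesis": 0.5,
    "normal apoB synthesis": 1.0,
    "apoB overexpression (assumed 2x)": 2.0,
}


def relative_ratio(mtp_level, apob_level):
    """MTP:apoB ratio relative to wild type, where both levels are 1.0."""
    return mtp_level / apob_level


def ratio_table(mtp_level=MTP_HETEROZYGOUS, apob_levels=APOB_LEVELS):
    """Ratio predicted for each apoB setting at a given MTP level."""
    return {name: relative_ratio(mtp_level, apob)
            for name, apob in apob_levels.items()}


if __name__ == "__main__":
    for name, ratio in ratio_table().items():
        print(f"{name}: MTP:apoB ratio = {ratio:.2f} x wild type")
```

If the ratio were the critical determinant, the first setting (ratio unchanged) should show little effect and the last (ratio lowest) the largest effect, which is not what was reported.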

Human apo B (apolipoprotein B) Transgene mice show apo B locus may have a causative
role in male infertility The fertility of apoB (apolipoprotein B) (+/-) mice was recorded during
the course of backcrossing (to C57BL/6J mice) and test mating. No apparent fertility
problem was observed in female apoB (+/-) and wild-type female mice, as was documented
by the presence of vaginal plugs in female mice. Although apoB (+/-) mice mated normally,
only 40% of the animals from the second backcross generation produced any offspring
within the 4-month test period. Of the animals that produced progeny, litters resulted from
< 50% of documented matings. In contrast, all wild-type mice (6/6, i.e., 100%) tested were
fertile. These data suggest genetic influence on the infertility phenotype, as a small number
of male heterozygotes were not sterile. Fertilization in vivo was dramatically impaired in
male apoB (+/-) mice. 74% of eggs examined were fertilized by the sperm from wild-type
mice, whereas only 3% of eggs examined were fertilized by the sperm from apoB (+/-) mice.
The sperm counts of apoB (+/-) mice were mildly but significantly reduced compared with
controls. However, the percentage of motile sperm was markedly reduced in the apoB (+/-)
animals compared with that of the wild-type controls. Of the sperm from apoB (+/-) mice,
20% (i.e., 4.9% of the initial 20% motile sperm) remained motile after 6 hr of incubation,
whereas 45% (i.e., 33.6% of the initial 69.5%) of the motile sperm retained motility in
controls after this time. In vitro fertilization yielded no fertilized eggs in three attempts with
apoB (+/-) mice, while wild-type controls showed a fertilization rate of 53%. However,
sperm from apoB (+/-) mice fertilized 84% of eggs once the zona pellucida had been
removed. Numerous sperm from apoB (+/-) mice were seen binding to zona-intact eggs.
However, these sperm lost their motility when observed 4-6 hours after binding, showing that
sperm from apoB (+/-) mice were unable to penetrate the zona pellucida but that the
interaction between sperm and egg was probably not direct. Sperm binding to zona-free
oocytes was abnormal. In the apoB (+/-) mice, sperm binding did not attenuate, even after






pronuclei had clearly formed, suggesting that apoB deficiency results in abnormal surface 
interaction between the sperm and egg. 


Knockout of the mouse apoB gene resulted in embryonic lethality in homozygotes,
protection against diet-induced hypercholesterolemia in heterozygotes, and developmental
abnormalities in mice.

Model of insulin resistance, dyslipidemia & overexpression of human apoB It was shown 
that the livers of apoB mice assemble and secrete increased numbers of VLDL particles. 

Example 3. Treatment of Diabetes Type-2 with iRNA

Introduction The regulation of hepatic gluconeogenesis is an important process in the
adjustment of the blood glucose level. Pathological changes in the glucose production of the
liver are a central characteristic in type-2-diabetes. For example, the fasting hyperglycemia
observed in patients with type-2-diabetes reflects the lack of inhibition of hepatic
gluconeogenesis and glycogenolysis due to the underlying insulin resistance in this disease.
Extreme conditions of insulin resistance can be observed, for example, in mice with a
liver-specific insulin receptor knockout ('LIRKO'). These mice have an increased expression of
the two rate-limiting gluconeogenic enzymes, phosphoenolpyruvate carboxykinase (PEPCK)
and the glucose-6-phosphatase catalytic subunit (G6Pase). Insulin is known to repress both
PEPCK and G6Pase gene expression at the transcriptional level, and the signal transduction
involved in the regulation of G6Pase and PEPCK gene expression by insulin is only partly
understood. While PEPCK is involved in a very early step of hepatic gluconeogenesis
(synthesis of phosphoenolpyruvate from oxaloacetate), G6Pase catalyzes the terminal step of
both gluconeogenesis and glycogenolysis, the cleavage of glucose-6-phosphate into
phosphate and free glucose, which is then delivered into the blood stream.

The pharmacological intervention in the regulation of expression of PEPCK and
G6Pase can be used for the treatment of the metabolic aberrations associated with diabetes.
Hepatic glucose production can be reduced by an iRNA-based reduction of PEPCK and
G6Pase enzymatic activity in subjects with type-2-diabetes.
Targets for iRNA

Glucose-6-phosphatase (G6Pase)

G6Pase mRNA is expressed principally in liver and kidney, and in lower amounts in
the small intestine. Membrane-bound G6Pase is associated with the endoplasmic reticulum.
Low activities have been detected in skeletal muscle and in astrocytes as well.

G6Pase catalyzes the terminal step in gluconeogenesis and glycogenolysis. The
activity of the enzyme is several fold higher in diabetic animals and probably in diabetic
humans. Starvation and diabetes cause a 2-3-fold increase in G6Pase activity in the liver and
a 2-4-fold increase in G6Pase mRNA.

Phosphoenolpyruvate carboxykinase (PEPCK)

Overexpression of PEPCK in mice results in symptoms of type-2-diabetes mellitus.
PEPCK overexpression results in a metabolic pattern that increases G6Pase mRNA and
results in a selective decrease in insulin receptor substrate (IRS)-2 protein, decreased
phosphatidylinositol 3-kinase activity, and reduced ability of insulin to suppress
gluconeogenic gene expression.



Table 7. Other targets to inhibit hepatic glucose production

Target                                      Comment
FKHR                                        good evidence for antidiabetic phenotype
                                            (Nakae et al., Nat Genetics 32:245 (2002))
Glucagon
Glucagon receptor
Glycogen phosphorylase
PGC-1 (PPAR-Gamma Coactivator)              regulates the cAMP response (and probably the
                                            PKB/FKHR-regulation) on PEPCK/G6Pase
Fructose-1,6-bisphosphatase
Glucose-6-phosphate translocator
Glucokinase inhibitory regulatory protein

Materials and Methods

Animals: BKS.Cg-m +/+ Lepr db mice, which contain a point mutation in the leptin receptor
gene, are used to examine the efficacy of iRNA for the targets listed above.
BKS.Cg-m +/+ Lepr db mice are available from the Jackson Laboratory (Stock Number
000642). These animals are obese at 3-4 weeks after birth, show elevation of plasma insulin
at 10 to 14 days, elevation of blood sugar at 4 to 8 weeks, and uncontrolled rise in blood
sugar. Exogenous insulin fails to control blood glucose levels and gluconeogenic activity
increases.

The following numbers of male animals (age > 12 weeks) would ideally be tested with
the following iRNAs:

PEPCK, 2 sequences, 5 animals per sequence
G6Pase, 2 sequences, 5 animals per sequence
1 nonspecific sequence, 5 animals
1 control group (only injected, no siRNA), 5 animals
1 control group (not injected, no siRNA), 5 animals
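
As a quick check on cohort size, the group list above can be tallied in a few lines. This is a minimal sketch that reads every group as 5 animals; the group labels are shorthand introduced here, and any additional positive-control group is not counted.

```python
# Each entry: (group label, number of sequences or groups, animals per group).
# Labels are shorthand for the groups listed in the text.
STUDY_GROUPS = [
    ("PEPCK iRNA", 2, 5),
    ("G6Pase iRNA", 2, 5),
    ("nonspecific iRNA", 1, 5),
    ("control: injected, no siRNA", 1, 5),
    ("control: not injected, no siRNA", 1, 5),
]


def animals_per_group(groups):
    """Return a mapping of group label to total animals in that group."""
    return {label: count * animals for label, count, animals in groups}


def total_animals(groups):
    """Total number of animals needed across all listed groups."""
    return sum(count * animals for _, count, animals in groups)


if __name__ == "__main__":
    for label, n in animals_per_group(STUDY_GROUPS).items():
        print(f"{label}: {n}")
    print("total:", total_animals(STUDY_GROUPS))
```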

Reagents: Necessary reagents would ideally include a Glucometer Elite XL (Bayer,
Pittsburgh, PA) for glucose quantification, and an Insulin Radioimmunoassay (RIA) kit
(Amersham, Piscataway, NJ) for insulin quantitation.

Assays:

G6P enzyme assays and PEPCK enzyme assays are used to measure the activity of the
enzymes. Northern blotting is used to detect levels of G6Pase and PEPCK mRNA.
Antibody-based techniques (e.g., immunoblotting, immunofluorescence) are used to detect
levels of G6Pase and PEPCK protein. Glycogen staining is used to detect levels of glycogen
in the liver. Histological analysis is performed to analyze tissues.

Gene information:

G6Pase
GenBank® No: NM_008061, Mus musculus glucose-6-phosphatase, catalytic
(G6pc), mRNA 1..2259, ORF 83..1156;
GenBank® No: U00445, Mus musculus glucose-6-phosphatase mRNA, complete cds
1..2259, ORF 83..1156
GenBank® No: BC013448

PEPCK
GenBank® No: NM_011044, Mus musculus phosphoenolpyruvate carboxykinase 1,
cytosolic (Pck1), mRNA 1..2618, ORF 141..2009
GenBank® No: AF009605.1
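
The ORF coordinates above are inclusive, 1-based positions, so the ORF length and codon count follow directly from them. The following is a small sketch of that arithmetic; the dictionary keys are labels chosen here, and whether the annotated ORF includes the stop codon is an assumption, not something the text states.

```python
# Inclusive 1-based ORF coordinates as listed in the gene information.
ORFS = {
    "G6Pase (NM_008061)": (83, 1156),
    "PEPCK (NM_011044)": (141, 2009),
}


def orf_length(start, end):
    """Length in nucleotides of an inclusive, 1-based interval."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def codon_count(start, end):
    """Number of codons in the interval (stop codon included if annotated)."""
    length = orf_length(start, end)
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length // 3


if __name__ == "__main__":
    for name, (start, end) in ORFS.items():
        print(name, orf_length(start, end), "nt,", codon_count(start, end), "codons")
```

A length that is not a multiple of three would suggest a transcription error in the coordinates, which makes this a convenient sanity check on the listed positions.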

Administration of iRNA:

iRNA corresponding to the genes described above would be administered to mice
with hydrodynamic injection. One control group of animals would be treated with
Metformin as a positive control for reduction in hepatic glucose levels.

Experimental Protocol

Mice would be housed in a facility in which there is light from 7:00 AM to 7:00 PM.
Mice would be fed ad libitum from 7:00 PM to 7:00 AM and fasted from 7:00 AM to 7:00 PM.

Day 0: 7:00 PM: Approximately 100 µl blood would be drawn from the tail. Serum would
be isolated to measure glucose, insulin, HbA1c (EDTA-blood), glucagon, FFAs, lactate,
corticosterone, serum triglycerides.

Day 1-7: Blood glucose would be measured daily at 8:00 AM and 6:00 PM (approx. 3-5 µl;
measured with a Haemoglucometer).

Day 8: Blood glucose would be measured at 8:00 AM and 6:00 PM. iRNA would be
injected between 10:00 AM and 2:00 PM.

Day 9-20: Blood glucose would be measured daily at 8:00 AM and 6:00 PM.

Day 21: Mice would be sacrificed after 10 hours of fasting.
Blood would be isolated. Glucose, insulin, HbA1c (EDTA-blood), glucagon, FFAs, lactate,
corticosterone, serum triglycerides would be measured. Liver tissue would be isolated for
histology, protein assays, RNA assays, glycogen quantitation, and enzyme assays.
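
The sampling plan above can be written out as a simple schedule, which makes it easy to count how many glucose readings each animal contributes. This is a minimal sketch, assuming twice-daily readings on days 1 through 20 with injection on day 8; the function and constant names are illustrative only.

```python
from datetime import time

# Twice-daily blood glucose readings, as described in the protocol.
MEASUREMENT_TIMES = (time(8, 0), time(18, 0))
INJECTION_DAY = 8


def glucose_schedule(first_day=1, last_day=20):
    """List of (day, time) pairs for every planned glucose reading."""
    return [(day, t)
            for day in range(first_day, last_day + 1)
            for t in MEASUREMENT_TIMES]


def baseline_readings(schedule, injection_day=INJECTION_DAY):
    """Readings taken on days before the injection day."""
    return [entry for entry in schedule if entry[0] < injection_day]


if __name__ == "__main__":
    schedule = glucose_schedule()
    print(len(schedule), "glucose readings per animal")
    print(len(baseline_readings(schedule)), "readings before the injection day")
```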
Example 4: Inhibition of Glucose-6-Phosphatase by iRNA in vivo

iRNA targeted to the Glucose-6-Phosphatase (G6P) gene was used to examine the
effects of inhibition of G6P expression on glucose metabolism in vivo.

Female mice, 10 weeks of age, strain BKS.Cg-m +/+ Lepr db (The Jackson
Laboratory) were used for in vivo analysis of enzymes of the hepatic glucose production.
Mice were housed under conditions where it was light from 6:30 am to 6:30 pm. Mice were
fed (ad libitum) during the night period and fasted during the day period.

On day 1, approximately 100 µl of blood was collected from test animals by puncturing the
retroorbital plexus. On days 1-7, blood glucose was measured in blood obtained from tail
veins (approximately 3-5 µl) using a Glucometer (Elite XL, Bayer). Blood glucose was
sampled daily at 8 am and 6 pm.

On day 7 at approximately 2 pm, GL3 plasmid (10 µg) and siRNAs (100 µg G6Pase
specific, Renilla nonspecific or no siRNA control) were delivered to animals using
hydrodynamic coinjection.

On day 8, GL3 expression was analyzed by injection of luciferin (3 mg) after
anaesthesia with avertin and imaging. This was done to control for successful hydrodynamic
delivery.

On days 8-10, blood glucose was measured in blood obtained from tail veins
(approximately 3-5 µl) using a Glucometer (Elite XL, Bayer).

On day 10, mice were sacrificed after 10 hours of fasting. Blood and liver were
isolated from sacrificed animals.

Results: Coinjection of GL3 plasmid and G6Pase iRNA (G6P4) reduced blood
glucose levels for the short term. Coinjection of GL3 plasmid and Renilla nonspecific iRNA
had no effect on blood glucose levels.



Example 5: Selected Palindromic Sequences

Tables 8-13 below provide selected palindromic sequences from the following genes: human
ApoB, human glucose-6-phosphatase, rat glucose-6-phosphatase, β-catenin, and hepatitis C
virus (HCV).
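
One way to inspect a source/match pair from these tables is to take the reverse complement of the source sequence and compare it position by position with the match sequence. The sketch below does this for the pair listed as SEQ ID NO: 2 and SEQ ID NO: 1005. It is an illustration only: the function names are introduced here, and it does not attempt to reproduce the "#" and "B" columns, whose derivation is not described in this section.

```python
# Lowercase DNA alphabet, as used in the sequence tables.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written in a, c, g, t."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def matching_positions(source, match):
    """Count positions where the match equals the source's reverse complement."""
    rc = reverse_complement(source)
    if len(rc) != len(match):
        raise ValueError("sequences must have the same length")
    return sum(a == b for a, b in zip(rc, match.lower()))


if __name__ == "__main__":
    source = "tgccatctcgagagttcca"  # SEQ ID NO: 2
    match = "tggaactctctccatggca"   # SEQ ID NO: 1005
    print(reverse_complement(source))
    print(matching_positions(source, match), "of", len(source), "positions agree")
```

For this pair the two ends agree with the reverse complement while a short central stretch differs, which is the general pattern of the pairs tabulated below.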
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Table 8. Selected palindromic sequences from human At>oB 





Source 


Start 
index 


End 
inaex 




Match 


Start 
in a ex 


End # 
inucA 


B 


SEQ ID NO: 1 


ggccattccaflaagggaag 


509 


528 


SEQ ID NO: 1004 


cttccgttctgtaatggcc 


5795 


5814 1 


9 


SEQ ID NO: 2 


tgccatctcgagagttcca 


4099 


4118 


SEQ ID NO: 1005 


tggaactctctccatggca 


10876 


10895 1 


8 


SEQ ID NO: 3 


catgtcaaacactttgtta 


7056 


7075 


SEQ ID NO: 1006 


taacaaattccttgacatg 


7358 


7377 1 


8 


SEQ ID NO: 4 


tttgttataaatcttattg 


7068 


7087 


SEQ ID NO: 1007 


caataagatcaatagcaaa 


8990 


9009 1 


8 


SEQ ID NO: 5 


tctggaaaagggtcatgga 


8880 


8899 


SEQ ID NO: 1008 


tccatgtcccatttacaga 


11356 


11375 1 


8 


SEQ ID NO: 6 


cagctcttgttcaggtcca 


10900 


10919 


SEQ ID NO: 1009 


tggacctgcaccaaagctg 


13952 


13971 1 


8 


SEQ ID NO; 7 


ggaggttccccagctctgc 


356 


375 


SEQ ID NO: 1010 


gcagccctgggaaaactcc 


6447 


6466 1 


7 


SEQ ID NO: 8 


ctgttttgaagactctcca 


1081 


1100 


SEQ ID NO: 1011 


tggagggtagtcataacag 


10327 


10346 1 


7 


SEQ ID NO: 9 


agtggctgaaacgtgtgca 


1297 


1316 


SEQ ID NO: 1012 


tgcagagctttctgccact 


13508 


13527 1 


7 


SEQ ID NO: 10 


ccaaaatagaaggg aatct 


2068 


2087 


SEQ ID NO: 1013 


agattcctttgccttttgg 


4000 


4019 1 


7 


SEQ ID NO: 11 


tgaagagaagattgaattt 


3620 


3639 


SEQ ID NO: 1014 


aaattctcttttcttttca 


9212 


9231 1 


7 


SEQ ID NO: 12 


agtggtggcaacaccagca 


4230 


4249 


SEQ ID NO: 1015 


tgctagtgaggccaacact 


10649 


10668 1 


7 


SEQ ID NO: 13 


aaggctccacaagtcatca 


5950 


5969 


SEQ ID NO: 1016 


tgatgatatctggaacctt 


10724 


10743 1 


7 


SEQ ID NO: 14 


gtcagccaggtttatagca 


7725 


7744 


SEQ ID NO: 1017 


tgctaagaaccttactgac 


7781 


7800 1 


7 


SEQ ID NO: 15 


tgatatctggaaccttgaa 


10727 


10746 


SEQ ID NO: 1018 


ttcactgttcctgaaatca 


7863 


7882 1 


7 


SEQ ID NO: 16 


gtcaagttgagcaatttct 


13423 


13442 


SEQ ID NO: 1019 


agaaaaggcacaccttgac 


11072 


11091 1 


7 


SEQ ID NO: 17 


atccagatggaaaagggaa 


13480 


13499 


SEQ ID NO: 1020 


ttccaatttccctgtggat 


3680 


3699 1 


7 


SEQ ID NO: 18 


atttgtttgtcaaagaagt 


4543 


4562 


SEQ ID NO: 1021 


acttcagagaaatacaaat 


11401 


11420 4 


6 


SEQ ID NO: 19 


ctggaaaatgtcagcctgg 


204 


223 


SEQ ID NO: 1022 


ocagacttccgtttaccag 


8235 


8254 2 


6 


SEQ ID NO: 20 


accaggaggttcttcttca 


1729 


1748 


SEQ ID NO: 1023 


tgaagtgtagtctcctggt 


5089 


5108 2 


6 


SEQ ID NO: 21 


aaagaagttctgaaagaat 


1956 


1975 


SEQ ID NO: 1024 


attccatcacaaatccttt 


9661 


9680 2 


6 


SEQ ID NO: 22 


gctacagcttatggctcca 


3570 


3589 


SEQ ID NO: 1025 


tggatctaaatgcagtagc 


11623 


11642 2 


6 


SEQ ID NO: 23 


atcaatattgatcaatttg 


6414 


6433 


SEQ ID NO: 1026 


caaagaagtcaagattgat 


4553 


4572 2 


6 


SEQ ID NO: 24 


gaattatcttttaaaacat 


7326 


7345 


SEQ ID NO: 1027 


atgtgttaacaaaatattc 


11494 


11513 2 


6 


SEQ ID NO: 25 


cgaggcccgcgctgctggc 


130 


149 


SEQ ID NO: 1028 


gccagaagtgagatcctcg 


3507 


3526 1 


6 


SEO ID NO* 26 


acaactatgaggctgagag 


271 


290 


SEQ ID NO: 1029 


ctctaaacaacaaatttQt 


10309 10328 1 


6 


SEQ ID NO: 27 


gctgagagttccagtggag 


282 


301 


SEQ ID NO: 1030 


ctccatggcaaatgtcagc 


10885 


10904 1 


6 


SEQ ID NO: 28 


tgaagaaaaccaagaactc 


448 


487 


SEQ ID NO: 1031 


gagtcatlgaggttcttca 


4929 


4948 1 


6 


SEQ ID NO: 29 


cctacttacatcctgaaca 


558 


577 


SEQ ID NO: 1032 


tgttcataagggaggtagg 


12766 


12785 1 


6 


SEQ ID NO: 30 


ctacttacatcctgaacat 


559 


578 


SEQ ID NO: 1033 


atgttcataagggaggtag 


12765 12784 1 


6 


SEQ ID NO: 31 


gagacagaagaagccaagc 61 5 


634 


SEQ ID NO: 1034 


gcttggttttgccagtctc 


2459 


2478 1 


6 


SEQ ID NO: 32 


cactcactttaccgtcaag 


671 


690 


SEQ ID NO: 1035 


cttgaacacaaagtcagtg 


6000 


6019 1 


6 


SEQ ID NO: 33 


ctgatcagcagcagccagt 


822 


841 


SEQ ID NO: 1036 


actgggaagtgcttatcag 


5237 


5256 1 


8 


SEQ ID NO: 34 


actggacgctaagaggaag 


854 


873 


SEQ ID NO: 1037 


cttccccaaagagaccagt 


2890 


2909 1 


6 
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SEQ ID NO: 


35 


agaggaagcatgtggcaga 


865 


884 


SEQ ID NO: 1038 


tctggcatttactttctct 


5921 


5940 


1 


6 


SEQ ID NO: 


36 


tgaagactctccaggaact 


1087 


1106 


SEQ ID NO: 1039 


agttgaaggaaactattca 


7216 


7235 


1 


6 


SEQ ID NO: 


37 


ctctgagcaaaatatccag 


1121 


1140 


SEQ ID NO: 1040 


ctggttactgagctgagag 


1161 


1180 


1 


6 


SEQ ID NO: 


38 


atgaagcagtcacatctct 


1189 


1208 


SEQ ID NO: 1041 


agagctgccagtccttcat 


10016 


10035 1 


6 


SEQ ID NO: 


39 


ttg ccacagctgattgagg 


1209 


1228 


SEQ ID NO: 1042 


cctcctacagtggtggcaa 


4222 


4241 


1 


6 


SEQ ID NO: 


40 


agctgattgaggfgtccag 


1216 


1235 


SEQ ID NO: 1043 


ctggattccacatgcagct 


11847 


11866 1 


6 


SEQ ID NO: 


41 


tgctccactcacatcctcc 


1278 


1297 


SEQ ID NO: 1044 


ggaggctttaagttcagca 

WW WW W W 


7601 


7620 


1 


6 


SEQ ID NO: 


42 


tgaaacgtgtgcatgccaa 


1303 


1322 


SEQ ID NO: 1045 


ttgggagagacaagtttca 


6500 


6519 


1 


6 


SEQ ID NO: 


43 


gacattgctaattacctga 


1603 


1522 


SEQ ID NO: 1046 


tcagaagctaagcaatgtc 


7232 


7251 


1 


6 


SEQ ID NO: 


44 


ttcttcttcagactttcct 


1738 


1757 


SEQ ID NO: 1047 


aggagagtccaaattagaa 


8498 


8517 


1 


6 


SEQ ID NO: 


45 


ccaatatcttgaactcaga 


1903 


1922 


SEQ ID NO: 1048 


tctgaattcattcaattgg 


6485 


6504 


1 


6 


SEQ ID NO: 


46 


aaagttagtgaaagaagtt 


1946 


1965 


SEQ ID NO: 1049 


aactaccctcactgccttt 


2132 


2161 


1 


6 


SEQ ID NO: 


47 


aagttagtgaaagaagttc 


1947 


1966 


SEQ ID NO: 1050 


gaacctctggcatttactt 


5916 


5935 


1 


6 


SEQ ID NO: 


48 


aaagaagttctgaaagaat 


1956 


1975 


SEQ ID NO: 1051 


attctctggtaactacttt 


5482 


5501 


1 


6 


SEQ ID NO: 


49 


tttggctataccaaagatg 


2322 


2341 


SEQ ID NO; 1052 


catcttaggcactgacaaa 


4997 


5016 


1 


6 


SEQ ID NO: 


50 


tgttgagaagctgattaaa 


2381 


2400 


SEQ ID NO: 1053 


tttagccatcggctcaaca 


5700 


5719 


1 


6 


SEQ ID NO: 


51 


caggaagggctcaaagaat 


2561 


2580 


SEQ ID NO: 1054 


attcctttaacaattcctg 


9492 


9511 


1 


6 


SEQ ID NO: 


52 


aggaagggctcaaagaatg 


2562 


2581 


SEQ ID NO: 1055 


cattcctttaacaattcct 


9491 


9510 


1 


6 


SEQ ID NO: 


53 


gaagggctcaaagaatgac 


2564 


2583 


SEQ ID NO: 1056 


gtcagtcttcaggctcttc 


7914 


7933 


1 


6 


SEQ ID NO: 


54 


caaagaatgacttttttct 


2572 


2591 


SEQ ID NO: 1057 


agaaggatggcattttttg 


14000 


14019 1 


6 


SEQ ID NO: 


55 


catggagaatgcctttgaa 


2603 


2622 


SEQ ID NO: 1058 


ttcag agccaaagtcx:atg 


7119 


7138 


1 


6 


SEQ ID NO: 


56 


ggagccaaggctggagtaa 


2679 


2698 


SEQ ID NO: 1059 


ttactccaacgccagctcc 


3050 


3069 


1 


6 


SEQ ID NO: 


57 


tcattccttccccaaagag 


2884 


2903 


SEQ ID NO: 1060 


ctctctggggcatctatga 


5139 


5158 


1 


6 


SEQ ID NO: 


58 


acctatgagctccagagag 


3165 


3184 


SEQ ID NO: 1061 


ctctcaagaccacagaggt 


12976 


12995 1 


6 


SEQ ID NO: 


59 


gggcaaaacgtcttacaga 


3365 


3384 


SEQ ID NO: 1062 


tctgaaagacaacgtgccc 


12317 


12336 1 


6 


SEQ ID NO: 


60 


accctggacattcagaaca 

WW w 


3387 


3406 


SEQ ID NO: 1063 


tgttgctaaggttcagggt 

w w WW www 


6675 


5694 


1 


6 


SEQ ID NO: 


61 


atgggcgacctaagttgtg 


3429 


3448 


SEQ ID NO: 1064 


cacaaattagtttcaccat 


8941 


8960 


1 


6 


SEQ ID NO: 


62 


gatgaagagaagattgaat 


3618 


3637 


SEQ ID NO: 1065 


attccagcttccccacatc 


8330 


8349 


1 


6 


SEQ ID NO: 


63 


caatgtagataccaaaaaa 


3656 


3675 


SEQ ID NO: 1066 


ttttttggaaatgccattg 


8643 


8662 


1 


6 


SEQ ID NO* 


64 


M VCIM CI tOvwCICICigCICTCI LM U 


3660 

wwW w 


3679 


SEO ID NO* 1067 




4371 


4390 


1 


6 
\j 


SEQ ID NO: 


65 


gcttcagttcatttggact 


4509 


4528 


SEQ ID NO: 1068 


agtcaagaaggacttaagc 


5304 


5323 


1 


6 


SEQ ID NO: 


66 


tttgtttgtcaaagaagtc 


4544 


4563 


SEQ ID NO: 1069 


gacttcagagaaatacaaa 


11400 


11419 1 


6 


SEQ ID NO: 


67 


ttgtttgtcaaagaagtca 


4545 


4564 


SEQ ID NO: 1070 


tgacttcagagaaatacaa 


11399 


11418 1 


6 


SEQ ID NO: 


68 


tggcaatgggaaactcgct 


5846 


5865 


SEQ ID NO: 1071 


agcgagaatcaccctgcca 


8219 


8238 


1 


6 


SEQ ID NO: 


69 


aacctctggcatttacttt 


5917 


5936 


SEQ ID NO: 1072 


aaaggagatgtcaagggtt 


10599 


10618 1 


6 


SEQ ID NO: 


70 


catttactttctctcatga 


5926 


5945 


SEQ ID NO: 1073 


tcatUgaaagaataaatg 


7026 


7045 


1 


6 


SEQ ID NO: 


71 


aaagtcagtgccctgctta 


6009 


6028 


SEQ ID NO: 1074 


taagaaccttactgacttt 


7784 


7803 


1 


6 
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ocQ ID NO. 72 




6322 


6341 


SEQ ID NO: 1075 


aaggacttcaggaatggga 


12004 12023 1 


6 


ocU ID NO: 73 


cdtcaataitgatcaaut 


OA ^ O 

6413 


6432 


SEQ ID NO: 1076 


aaattaaaaagtcttgatg 


Of 


O/Ol 1 


A 

6 


ocQ lU NO: 74 




6doo 


6584 


SEQ ID NO: 1077 


taaaccaaaacttggttta 


9019 


9038 1 


A 

6 


ocU lU NO: 75 


tattgatgaaatcattgaa 


6713 


6732 


SEQ ID NO: 1078 


ttcaaagacttaaaaaata 


8007 


8026 1 


A 

6 


ceo ir\ MO* "7e 
OCQ IP NO: 76 


atgatciacatttgtttat 


6790 


6809 


SEQ ID NO: 1079 


ataaagaaattaaagtcat 


7380 


7399 1 


6 


ScQ ID NO: 77 


agagacacatacagaatat 


6919 


AAA n 

6936 


SEQ ID NO: 1080 


atatattgtcagtgcctct 


13382 


13401 1 


6 


OcQ ID NO: 7o 


gacacatacagaatataga 


6922 


A A /> .4 

6941 


SEQ ID NO: 1081 


tctaaattcagttcttgtc 


11327 


11346 1 


6 


ocU ID NO: 79 


agcatgtcaaacactttgt 


7054 


7073 



SEQ ID NO: 1082 


acaaagtcagtgccctgct 


6007 


6026 1 


6 


SEQ ID NO: 80


tttttagaggaaaccaagg 


7515 


7534 


SEQ ID NO: 1083 



cctttgtgtacaccaaaaa 


11230 


11249 1 


6 


SEQ ID NO: 81


ttttagaggaaaccaaggc


7516 


7535 


SEQ ID NO: 1084 


gcctttgtgtacaccaaaa 


11229 11248 1 


6 


SEQ ID NO: 82


ggaagatagacttcctgaa 


9307 



9326 


SEQ ID NO: 1085 


ttcagaaatactgttttcc 


12824 


12843 1 


6 


SEQ ID NO: 83


cacigiucigagicccag 


9334


9353 


SEQ ID NO: 1086


ctgggacctaccaagagtg 


12523 


12542 1 


6 


SEQ ID NO: 84


cacaaatcctttggctgtg


9668


9687


SEQ ID NO: 1087


cacatttcaaggaattgtg 


10063 


10082 1 


6 


SEQ ID NO: 85


ucciggaiacactgncc 


9853


9872 


SEQ ID NO: 1088


ggaactgttgactcaggaa 


12569 


12588 1 


6 


SEQ ID NO: 86


gaaatctcaagcTuctct 


10042 



10061 


SEQ ID NO: 1089


agagccaggtcgagctttc 


11044 


11063 1 


6 


SEQ ID NO: 87


luCucaicucaictgx 


10210


10229


SEQ ID NO: 1090


acagctgaaagagatgaaa 


13055 


13074 1 


6 


SEQ ID NO: 88


tctaccgctaaaggagcag




10521

10540 


SEQ ID NO: 1091


ctgcacgctttgaggtaga 


11761 


11780 1 


6 


SEQ ID NO: 89


ctaccgctaaaggagcagt


10522


10541


SEQ ID NO: 1092 


actgcacgctttgaggtag 


11760 11779 1 


6 


SEQ ID NO: 90


agggcctctttttcaccaa 


10831 


10850


SEQ ID NO: 1093 


ttggccaggaagtggccct 


10957 


10976 1 


6 


SEQ ID NO: 91


nctccatccctgtaaaag 


11265 



11284 


SEQ ID NO: 1094 


ctttttcaccaacggagaa 


10838 


10857 1 


6 


SEQ ID NO: 92


gaaaaacaaagcagattat 


11616 



11835 



SEQ ID NO: 1095 


ataaactgcaagatttttc 


13600 


13619 1 


6 


SEQ ID NO: 93


actcactcattgattttct 


12682 



12701 


SEQ ID NO: 1096 


agaaaatcaggatctgagt 


14027 


14046 1 


6 


SEQ ID NO: 94


taaactaatagatgtaatc


12690 



12909 


SEQ ID NO: 1097 


gattaccaccagcagttta 


13578 13597 1 


6 


SEQ ID NO: 95


caaaacgagcttcaggaag 


13200 



13219 


SEQ ID NO: 1098 


cttcgtgaagaatattttg 


13260 


13279 1 


6 


SEQ ID NO: 96


tggaataatgctcagtgtt 


2366 



2385 


SEQ ID NO: 1099 



aacacttacttgaattcca 


10662 


10681 3 


5 


SEQ ID NO: 97


gatttgaaatccaaagaag 


2400 



2419 


SEQ ID NO: 1100 


cttcagagaaatacaaatc 


11402 


11421 3 


5 


SEQ ID NO: 98


atttgaaatccaaagaagt 


2401 



2420 


SEQ ID NO: 1101 


acttcagagaaatacaaat 


11401 


11420 3 


5 


SEQ ID NO: 99


atcaacagccgcttctttg 


990 



1009 


SEQ ID NO: 1102 


caaagaagtcaagattgat 


4553


4572 2 


5 


SEQ ID NO: 100


tgttngaagactctccag 


1082 



1101 



SEQ ID NO: 1103 


ctggaaagttaaaacaaca 


6955 


6974 2 


5 


SEQ ID NO: 101


cccttctgatagatgtggt 


1324 



1343 


SEQ ID NO: 1104 


accaaagctggcaccaggg 


13961 


13980 2 


5 


SEQ ID NO: 102


tgagcaagtgaagaacttt 


1868 


1887 


SEQ ID NO: 1105 



aaagccattcagtctctca


12963 


12982 2 


5 


SEQ ID NO: 103 


atttgaaatccaaagaagt 


2401 


2420 


SEQ ID NO: 1106 


acttttctaaacttgaaat 


9055 


9074 2 


5 


SEQ ID NO: 104 


atccaaagaagtcccggaa 


2408 


2427 


SEQ ID NO: 1107 


ttccggggaaacctgggat 


12721 


12740 2 


5 


SEQ ID NO: 105 


agagcctacctccgcatct 


2430 


2449 


SEQ ID NO: 1108 


agatggtacgttagcctct 


11921 


11940 2 


5 


SEQ ID NO: 106 


aatgcctttgaactcccca 


2610 


2629 


SEQ ID NO: 1109 


tgggaactacaatttcatt 


7012 


7031 2 


5 


SEQ ID NO: 107 


gaagtccaaattccggatt 


3297 


3316 


SEQ ID NO: 1110 


aatcttcaatttattcttc 


13815 


13834 2 


5 


SEQ ID NO: 108 


tgcaagcagaagccagaag 3496 


3515 


SEQ ID NO: 1111 


cttcaggttccatcgtgca 


11376 


11395 2 


5 


SEQ ID NO: 109 


gaagagaagattgaatttg 


3621 


3640 


SEQ ID NO: 1112 


caaaacctactgtctcttc 


10459 10478 2 


5 
SEQ ID NO: 110


atactaaaaacacatataa 




481 




SEQ ID NO: 1113


O vdlcl ly o doy l^elay Wd I 


1 2A(>A 


19ft7R 2 


5 

w 

1 


SEQ ID NO: 111


tccctcacctccacctcta 


4737 


4756 


SEQ ID NO: 1114


Vd^dllwlway alydyyya 


oy \c. 


ftQ31 0 


5 

w 


SEQ ID NO: 112


atttacaactctoacaa at 


5427 


5446 


SEQ ID NO: 1115


aa a/*ff naaaf 
duiiuwidddvuydadl 


wUww 


Q074 2 


w 


SEQ ID NO: 113


aaaaacctaccaaaataat 


5594 


5613 

WW 1 w 


SEQ ID NO: 1116


flttatnhtnaaarantrrf 
aiiciLy iiy aaawdu iwwi 


1 1 R3n 


1 1 649 2 


5 

w 


SEQ ID NO: 114 


aaagctgaagcacatcaat


6401 


6420 


SEQ ID NO: 1117


d^iyikyw iwd iw iw w i li 


10194 

1 V 1 w*r 


10213 2 


w 


SEQ ID NO: 115 


ctgctggaaacaacgagaa


9418 


9437 


SEQ ID NO: 1118


ttrtnatta rfaf*r'fl n can 

I iv.f ty a tia wwd w way wciy 


13574 

1 WWf *T 


13593 2 

1 Wwww ^ 


w 


SEQ ID NO: 116


ttaaaaaaattcttQaaaa 


9582 


9601 

Www 1 


SEQ ID NO: 1119


tHta9aflnaa9fr*ttr*9a 
iiiiciaooyddaiwiiwdci 


1 wwww 


13R24 2 


w 


SEQ ID NO: 117


□aaataaaaaaaaatttta 


10743 


10762 

1 W f w& 


SEQ ID NO: 1120




1 U*tw9 


10476 2 

1 w*r r O £ 


ei 
w 


SEQ ID NO: 118


taaaaaaaataacaaattt 


11984 


12003 

1 &WWW 


SEQ ID NO: 1121


aaaf nfpanr*foffnHr»si 
dddiyiwdywiwliyilwd 


1 uoy*T 


1091 3 2 
1 uy 1 o ^ 


c 
w 


SEQ ID NO: 119


anaatctaacittaitttac 


14035 


14054 


SEQ ID NO: 1122


ni*aantf*anpfv*ariHr»r*f 
y wddy luciy l/wl/dy lllA/l 




10939 2 


c 
w 


SEQ ID NO: 120




18 


37 


SEQ ID NO: 1123


oan*^f*QWriQf*ofnort#^Qi* 
wdgCQdUgavcligdgvaC 


wf *fU 


wf w9 i 


c 
w 


SEQ ID NO: 121




146 


165 

1 uw 


SEQ ID NO: 1124


udy CiwCciV/dy aw iCvy WW 


3nR9 


30A1 1 
OwO 1 i 


e 

w 


SEQ ID NO: 122


viy vy viy viy vfcy viy VI 


154 


173 

1 / w 


SEQ ID NO: 1125


anpanaanntn^naan^an 

ay cdy day g ly cy d ay cay 


3294 


3243 1 


c 
w 


SEQ ID NO: 123


Qctactaocciaacciccacici 

y viy \*ky y vyy y vy wwayy 


170 


189 


SEQ ID NO: 1126


p f^f nn aH rT'a/* af n p an 
wUiyydiiwi/dvdiy wdyv/ 


1 1fi4fi 

1 1 OHO 


11665 1 
1 lOOw 1 


e 
w 


SEQ ID NO: 124


aaaaoQ a aatactaaaaa a 

*^«y ^y y EiGiiy w i^^MMCi CI 


193 


212 


SEQ ID NO: 1127


iiLiiwiiMav<ldwalWli 


25ft4 

^wOH 


2603 1 

^WWW 1 


c 
w 


SEQ ID NO: 125


ctaaaaaatatcaacctcia 


204 


223 

fc&W 


SEQ ID NO: 1128


pr*anaf*ttpf*anatpf*f*an 
wway a w I iw wo wd iw wwdy 


WW 1 w 


3934 1 


C 
w 


SEQ ID NO: 126 


taoaatccctaaaactact 


296 


315 

w 1 w 


SEQ ID NO: 1129


ay wa ly \^\* UBiy 1 1 iwiwwd 


Q945 

w?7*TW 


9964 1 

WWO*T 1 


R 
w 


SEQ ID NO: 127 


a aaatcccta aaactacta 

yy wy »wwwiyyywvfcyw*y 


297 


316 

W IW 


SEQ ID NO: 1130


cs4ncatnootftnttfr*fpp 
waywdiy wwinyiiiwiUw 


9944 


9963 - 1 

wwDw . 1 


c 

w 


SEQ ID NO: 128


toQCiactactaattcaaaa 

wy yy awiy wiy aiivcicsiy a 


305 


324 

Wfc~ 


SEQ ID NO: 1131


iwiiwwaiwdwiiydwWwd 


2042 


2061 1 


W 


SEQ ID NO: 129 


ctactaattcaaaaaafac 

*• iy V >gjwfctvwwy wwy ty V 


310 

W 1 V 


329 

Wd W 


SEQ ID NO: 1132


ywdwdwwiiydwdiiywdy 


1 1079 

1 Iwf w 


11096 1 

1 1 wwO 1 


c 
w 


SEQ ID NO: 130 


toccaccaaaatcaactac 


326 


345 

w^ w 


SEQ ID NO: 1133


y wayy wiy aawLyy ly y wa 


2717 

A. f If 


2736 1 

£. 1 WW 1 


w 


SEQ ID NO: 131 


a ccacca a a atcaactcica 

y y y *wd vy wn 


327 


346 

wTW 


SEQ ID NO: 1134


tnpannpfnaapfnnfnnp 


2716 
Iw 


2735 1 
£r WW 1 


R 
w 


SEQ ID NO: 132


iy vcmyy iiyoy wiyyayy 


342 


361 


SEQ ID NO: 1135


pptppapptptnafptripa 

wwiwudwi/iwiydiuiyud 


4744 


4763 1 


c 
w 


SEQ ID NO: 133 


caaocittaaactaaaGatt 

'^w^yy **y »y vvy y wyy it 


344 


363 

www 


SEQ ID NO: 1136


aapppptapatnaanpHri 
aawwwwidwdiy ddy^iiy 


13755 

1 W f WW 


13774 1 


R 

w 


SEQ ID NO: 134


ctctacaacttcatcctaa 


369 


388 

wwW 


SEQ ID NO: 1137


iwdyyddy wiiwiwddydy 


13211 

1 Wb 1 1 


13230 1 

1 w^wU 1 


R 
w 


SEQ ID NO: 135


caacttcatccta aaa acc 


374 


393 

www 


SEQ ID NO: 1138


y y iwviy ay iiddd ly wiy 


4977 


4996 1 

*TWwO 1 


R 
W 


SEQ ID NO: 136 


acttcatcctdaaasccsa 

y u 1 iud ly CI CI y d V VCI y 


376 


395 

www 


SEQ ID NO: 1139


ptnnapnpfaanannaanp 
wiyydwy widdydyyddyv/ 


ft5(% 


674 1 
Or *f 1 


R 
w 


SEQ ID NO: 137


toatcctaaaaaccaacca 

twd iws/ty ddy dwdy wWQ 


379 


398 

www 


SEQ ID NO: 1140


tnncatnnpattafnatna 
ly y i«diy y wdiid ly diy d 


3604 


3623 1 


R 
w 


SEQ ID NO: 138


CI aaa acc a fid a ctnf a a n 

y ddddvwaay ddwiviy ay 


452 


471 


SEQ ID NO: 1141


pfp a apptf aa fn a ftffp 
wiwdaww iiddiy aiiiiV/ 




6305 1 
Owww 1 


R 
w 


SEQ ID NO: 139


aoaantntaanrmntftrir* 


460 


479 

•t 1 w 


SEQ ID NO: 1142


yUadyuidldwayiailCI 


A37T 


A3QR i 

ooyo i 


R 

W 


SEQ ID NO: 140


iwiydy y ay Liiy uiy uay 




4A4 


SEQ ID NO: 1143


cigcaggggaicccccaga 


^04O 


40H0 1 


c 
0 




luQCiQcagccatQtcca 


474 


493 


SEQ ID NO: 1144


tggaagtgtcagtggcaaa 


10372 


10391 1 


5 


SEQ ID NO: 142 


caagaggggcatcatttct 


578 


597 


SEQ ID NO: 


1145 


agaataaatgacgttcttg 


7035 


7054 1 


5 


SEQ ID NO: 143 


tcactttaccgtcaagacg 


674 


693 


SEQ ID NO: 


1146 


cgtctacactatcatgtga 


4360 


4379 1 


5 


SEQ ID NO: 144 


tttaccgtcaagacgagga 


678 


697 


SEQ ID NO:


1147 


tccttgacatgttgataaa 


7366 


7385 1


5 


SEQ ID NO: 145 


cactggacgctaagaggaa 


853 


872 


SEQ ID NO: 1148


ttccagaaagcagccagtg 


12498 


12517 1 


5 


SEQ ID NO: 146 


aggaagcatgtggcagaag 


867 


886 


SEQ ID NO: 1149


cttcatacacattaatcct 


9988 


10007 1 


5 


SEQ ID NO: 147 


caaggagcaacacctcttc 


893 


912 


SEQ ID NO:


1150 


gaagtagtactgcatcttg


6835 


6854 1 


5 
SEQ ID NO: 148
SEQ ID NO: 149
SEQ ID NO: 150
SEQ ID NO: 151
SEQ ID NO: 152
SEQ ID NO: 153
SEQ ID NO: 154
SEQ ID NO: 155
SEQ ID NO: 156
SEQ ID NO: 157
SEQ ID NO: 158
SEQ ID NO: 159
SEQ ID NO: 160
SEQ ID NO: 161
SEQ ID NO: 162
SEQ ID NO: 163
SEQ ID NO: 164
SEQ ID NO: 165
SEQ ID NO: 166
SEQ ID NO: 167
SEQ ID NO: 168
SEQ ID NO: 169
SEQ ID NO: 170
SEQ ID NO: 171
SEQ ID NO: 172
SEQ ID NO: 173
SEQ ID NO: 174
SEQ ID NO: 175
SEQ ID NO: 176
SEQ ID NO: 177
SEQ ID NO: 178
SEQ ID NO: 179
SEQ ID NO: 180
SEQ ID NO: 181
SEQ ID NO: 182
SEQ ID NO: 183
SEQ ID NO: 184
SEQ ID NO: 185



acagactttgaaacttgaa 959 978 

tgatgaagcagtcacatct 1187 1206

agcagtcacatctctcttg 1193 1212

ccagccccatcactttaca 1231 1250 

ctccactcacatcctccag 1280 1299 

catgccaacccccttctga 1314 1333

gagagatcttcaacatggc 1390 1409 

tcaacatggcgagggatca 1399 1418 

ccaccttgtatgcgctgag 1429 1448 

gtcaacaactatcataaga 1455 1474 

tggacattgctaattacct 1501 1520 

ggacattgctaattacc^ 1502 1521 

ttctgcgggtcattggaaa 1573 1592 

ccagaactcaagtcttcaa 1620 1639

agtcttcaatcctgaaatg 1630 1649

tgagcaagtgaagaacttt 1868 1887

agcaagtgaagaactttgt 1870 1889 

tctgaaagaatctcaactt 1964 1983

actgtcatggacttcagaa 1986 2005 

acttgacccagcctcagcc 2051 2070 

tccaaataactaccttcct 2096 2115 

actaccctcactgcctttg 2133 2152 

ttggatttgcttcagctga 2149 2168 

ttggaagctctttttggga 2211 2230

ggaagctctttttgggaag 2213 2232

tttttcccagacagtgtca 2238 2257 

agacagtgtcaacaaagct 2246 2265 

ctttggctataccaaagat 2321 2340 

caaagatgataaacatgag 2333 2352 

gatatggtaaatggaataa 2355 2374 

ggaataatgctcagtgttg 2367 2386

tttgaaatccaaagaagtc 2402 2421 

gatcccccagatgattgga 2534 2553 

cagatgattggagaggtca 2541 2560 

agaatgacttttttcttca 2575 2594 

gaactccccactggagctg 2619 2638 

atatcttcatctggagtca 2652 2671 

gtcattgctcccggagcca 2667 2686 



SEQ ID NO: 1151
SEQ ID NO: 1152
SEQ ID NO: 1153
SEQ ID NO: 1154
SEQ ID NO: 1155
SEQ ID NO: 1156
SEQ ID NO: 1157
SEQ ID NO: 1158
SEQ ID NO: 1159
SEQ ID NO: 1160
SEQ ID NO: 1161
SEQ ID NO: 1162
SEQ ID NO: 1163
SEQ ID NO: 1164
SEQ ID NO: 1165
SEQ ID NO: 1166
SEQ ID NO: 1167
SEQ ID NO: 1168
SEQ ID NO: 1169
SEQ ID NO: 1170
SEQ ID NO: 1171
SEQ ID NO: 1172
SEQ ID NO: 1173
SEQ ID NO: 1174
SEQ ID NO: 1175
SEQ ID NO: 1176
SEQ ID NO: 1177
SEQ ID NO: 1178
SEQ ID NO: 1179
SEQ ID NO: 1180
SEQ ID NO: 1181
SEQ ID NO: 1182
SEQ ID NO: 1183
SEQ ID NO: 1184
SEQ ID NO: 1185
SEQ ID NO: 1186
SEQ ID NO: 1187
SEQ ID NO: 1188



ttcaattcttcaatgctgt 

agatttgaggattccatca 

caaggagaaactgactgct 

tgtagtctcctggtgctgg 

ctggagcttagtaatggag

tcagatgagggaacacatg 

gccaccctggaactctctc

tgatcccacctctcattga

ctcagggatctgaaggtgg 

tcttgagttaaatgctgac 

aggtatattcgaaagtcca 

caggtatattcgaaagtcc 

tttcacatgccaaggagaa 

ttgaagtgtagtctcctgg 

catttctgattggtggact 

aaagtgccacttttactca 

acaaagtcagtgccctgct 

aagtccataatggttcaga 

ttctgaatatattgtcagt 

ggctcaccctgagagaagt 

aggaagatatgaagatgga 

caaatttgtggagggtagt 

tcagtataagtacaaccaa 

tcccgattcacgcttccaa 

cttcagaaagctaccttcc 

tgaccttctctaagcaaaa 

agcttggttttgccagtct

atctcgtgtctaggaaaag 

ctcaaggataacgtgtttg 

ttatcttattaattatatc 

caacacttacttgaattcc 

gacttcagagaaatacaaa 

tccaatttccctgtggatc 

tgaccacacaaacagtctg 

tgaagtccggattcattct 

cagctcaaccgtacagttc 

tgacttcagtgcagaatat 

tggccccgtttaccatgac 



10500 10519 1 5 
7976 7995 1 5 
6524 6543 1 5 
5094 5113 1 5 
8709 8728 1 5 
8919 8938 1 5 
10869 10888 1 5 
2965 2984 1 5 
8187 8206 1 5 
4979 4998 1 5 
12799 12818 1 5 
12798 12817 1 5 
6514 6533 1 5 
5088 5107 1 5 

7757 7776 1 5 

6183 6202 1 5 

6007 6026 1 5 

12811 12830 1 5 

13376 13395 1 5 

12391 12410 1 5 

4712 4731 1 6 

10319 10338 1 5 

9392 9411 1 6 

11577 11596 1 5 

7929 7948 1 5 

4876 4895 1 5 

2458 2477 1 5 

5968 5987 1 5 

12609 12628 1 5 

13079 13098 1 5 

10661 10680 1 5 

11400 11419 1 5 

3681 3700 1 5 

5363 5382 1 5 

11015 11034 1 5 

11861 11880 1 5 

11966 11985 1 5 

5809 5828 1 5 
SEQ ID NO: 186


gctgaagtttatcattcct 


2873 


2892 


SEQ ID NO: 1189 


aggaggctttaagttcagc 


7600 


7619 1 


5 


SEQ ID NO: 187 


attccttccccaaagagac 


2886 


2905 


SEQ ID NO: 1190 


gtctcttcctccatggaat 


10470 10489 1 


5 


SEQ ID NO: 188 


ctcattgagaacaggcagt 


2976 


2995 


SEQ ID NO: 1191 


actgactgcacgctttgag 


11756 


11775 1 


5 


SEQ ID NO: 189 


ttgagcagtattctgtcag 


3142 


3161 


SEQ ID NO: 1192 


ctgagagaagtgtcttcaa 


12399 


12418 1 


5 



SEQ ID NO: 190 


accttgtccagtgaagtcc 


3285 


3304 


SEQ ID NO: 1193 


ggacggtactgtcccaggt 


12784 


12803 1 


5 


SEQ ID NO: 191 


ccagtgaagtccaaattcc 


3292 


3311 


SEQ ID NO: 1194 


ggaaggcagagtttactgg 


9148 


9167 1 


5 


SEQ ID NO: 192 


acattcagaacaagaaaat 


3394 


3413 


SEQ ID NO: 1195


atttcctaaagctggatgt 


11167 


11186 1 


5 


SEQ ID NO: 193 


gaaaaatcaagggtgttat


3463 


3482 


SEQ ID NO: 1196 


ataaactgcaagatttttc 


13600 


13619 1 


5 


SEQ ID NO: 194 


aaatcaagggtgttatttc 


3466 


3485 


SEQ ID NO: 1197 


gaaacaatgcattagattt 


9745 


9764 1 


5 



SEQ ID NO: 195 


tggcattatgatgaagaga 


3609 


3628 


SEQ ID NO: 1198 


tctcccgtgtataatgcca 


11781 


11800 1 


5 


SEQ ID NO: 196 


aagagaagattgaatttga 


3622 


3641 


SEQ ID NO: 1199 


tcaaaacctactgtctctt 


10458 10477 1 


5 


SEQ ID NO: 197 



aaatgacttccaatttccc 


3673 


3692 


SEQ ID NO: 1200 


gggaactacaatttcattt 


7013 


7032 1 


5 


SEQ ID NO: 198 


atgacttccaatttccctg 


3675 


3694 


SEQ ID NO: 1201 


caggctgattacgagtcat 


4917 


4936 1 


5 


SEQ ID NO: 199 


acttccaatttccctgtgg 


3678 


3697 


SEQ ID NO: 1202 


ccacgaaaaatatggaagt 


10360 


10379 1 


5 


SEQ ID NO: 200 


agttgcaatgagctcatgg 


3803 


3822 


SEQ ID NO: 1203 


ccatcagttcagataaact 


7989 



8008 1 


5 


SEQ ID NO: 201 



tttgcaagaccacctcaat 


3860 


3879 


SEQ ID NO: 1204 


attgacc^tccattcaaa 


13671 


13690 1 


5 


SEQ ID NO: 202 



gaaggagttcaacctccag 


3884 


3903 


SEQ ID NO: 1205 


ctggaattgtcattccttc 


11728 


11747 1 


5 


SEQ ID NO: 203 


acttccacatcccagaaaa


3919 


3938 


SEQ ID NO: 1206 


ttttaacaaaagtggaagt 


6821 


6840 1 


5 


SEQ ID NO: 204 



ctcttcttaaaaagcgatg 


3939 


3958 


SEQ ID NO: 1207 


catcactgccaaaggagag 


8486 


8505 1 


5 


SEQ ID NO: 205 



aaaagcgatggccgggtca 


3948 


3967 


SEQ ID NO: 1208 


tgactcactcattgatttt 


12680 


12699 1 


5 


SEQ ID NO: 206 



ttcctttgccttttggtgg 


4003 


4022 


SEQ ID NO: 1209 


ccacaaacaatgaagggaa 9256 


9275 1 


5 


SEQ ID NO: 207 


caagtctgtgggattccat 


4079 


4098 


SEQ ID NO: 1210 


atgggaaaaaacaggcttg 


9566 


9585 1 


5 



SEQ ID NO: 208 


aagtccctacttttaccat 


4117 


4136 


SEQ ID NO: 1211 


atgggaagtataagaactt 


4834 


4853 1 


5 


SEQ ID NO: 209 


tgcctctcctgggtgttct 


4159 


4178 


SEQ ID NO: 1212


agaaaaacaaacacaggca 9643 


9662 1 


5 


SEQ ID NO: 210 


accagcacagaccatttca


4242 


4261 


SEQ ID NO: 1213 


tgaagtgtagtctcctggt 


5089 


5108 1 


5 



SEQ ID NO: 211 


ccagcacagaccatttcag 


4243 


4262 


SEQ ID NO: 1214 


ctgaaatacaatgctctgg 


5511 


5530 1 


5 


SEQ ID NO: 212 


actatcatgtgatgggtct 


4367 


4386 


SEQ ID NO: 1215 


agacacctgattttatagt 


7948 


7967 1 


5 


SEQ ID NO: 213 


accacagatgtctgcttca 


4496


4515 


SEQ ID NO: 1216 


tgaaggctgactctgtggt 


4282 


4301 1 


5 


SEQ ID NO: 214 


ccacagatgtctgcttcag 


4497 


4516 



SEQ ID NO: 1217 


ctgagcaacaaatttgtgg 


10311 


10330 1 


5 


SEQ ID NO: 215 


tttggactccaaaaagaaa 


4520 


4539 


SEQ ID NO: 1218 


tttctctcatgattacaaa 


5933 


5952 1 


5 


SEQ ID NO: 216 



tcaaagaagtcaagattga 


4552 


4571


SEQ ID NO: 1219


tcaaggataacgtgtttga 


12610 


12629 1 


5 


SEQ ID NO: 217 


atgagaactacgagctgac 


4798 


4817 


SEQ ID NO: 1220 


gtcagatattgttgctcat 


10187 


10206 1 


5 


SEQ ID NO: 218 


ttaaaatctgacaccaatg 


4818 


4837 


SEQ ID NO: 1221 


cattcattgaagatgttaa 


7342 


7361 1 


5 


SEQ ID NO: 219 


gaagtataagaactttgcc 


4838


4857 


SEQ ID NO: 1222 



ggcaaatttgaaggacttc 


11994 


12013 1 


5 


SEQ ID NO: 220 


aagtataagaactttgcca 


4839 


4858 


SEQ ID NO: 1223 


tggcaaatttgaaggactt 


11993 12012 1 


5 


SEQ ID NO: 221 


ttcttcagcctgctttctg 


4941 


4960 


SEQ ID NO: 1224 


cagaatccagatacaagaa 


6884 


6903 1 


5 


SEQ ID NO: 222 


ctggatcactaaattccca 


4957 


4976 


SEQ ID NO: 1225 


tgggtctttccagagccag 


11033 11052 1 


5 


SEQ ID NO: 223 


aaattaatagtggtgctca 


5014 


5033 


SEQ ID NO: 1226 


tgagaagccccaagaattt 


6248 


6267 1 


5 
5073


5092 


SEQ ID NO: 1227


fraaaHnf^tnnatflngnt 


9848 


9867 1 


5 






i/&vO 


'5957 


SEQ ID NO: 1228


(v^nac^Rttcftcatacca a 

wwiy ClwwLlwOVOi idwvuj^ 


8310 


8329 1 


5 




n^asaaaoatfHnaaptt 


5278 


5297 


SEQ ID NO: 1229


aaataaaaaaaaattttnc 

ciovy iciaaajjciGiGict nii^w 


10744 


10763 1 


5 


SEQ ID NO: 227


aaaaa/^aftff^aaf^tt^a 
aaad aUalU Iwd d U llwd 


^5980 




SEQ ID NO: 1230


tn;3nntaaaaaaaaatttt 


10742 


10761 1 


5 




tf^ont^ a a n o a n /I a Afta a 
iway iCddy day y dullad 


'5'^09 

«JwV/iC 


5391 


SEQ ID NO: 1231


tfasflflacttccattctna 


13363 


13382 1 


5 


SEQ ID NO: 229


ICaSaiy aCaig dig gyoi 


'5'^9'i 


5344 


SEQ ID NO: 1232


ciy WW w6i iwooi icj iwGiiiy a 


6205 


6224 1 


5 

w 


SEQ ID NO: 230


UaCdUaddCciy lUig ddUd 




53AR 


SEQ ID NO: 1233


tnf ttr*99ptn pf"*f ftntn 
ly iiiwddw ly Liiy ly 


11219 11238 1 


5 

w 




kf*k\f^^ o a 0#*a 
ICllCdBaaWligolCaava 


^^400 


549 ft 


SEQ ID NO: 1234


ly illlww idiiiwwcsiay a 


12835 12854 1 


5 

w 


SEQ ID NO: 232


^aantt4taf9anr*aQQr*i 
CddyuildcddyudddCl 


<^44i 


54R0 


SEQ ID NO: 1235


ayiidiuiywiaaaoiiy 


14043 


14062 1 


5 

w 


SEQ ID NO: 233


tn n f a a f^f a ^ tHa a a^an 
ly y laaulaClUaoldWciy 


';4flR 

w*twO 


5507 


SEQ ID NO: 1236


wiyiiiiiciyay^ciaawvQi 


7512 


7531 1 


5 

w 


SEQ ID NO: 234


Qja/^acif rt af*r"fn aa aJar'a 
ddCdyiydCuLyddaldvd 


'^'109 

0«/V£. 


5591 


SEQ ID NO: 1237


f'^^af an^a aaftf^ctatt 
iy td idy waaciiiwwi^ Li 


5890 


5909 1 


5 

w 


SEQ ID NO: 235


yy y dddL'idwy y w toy aciv 


5544 

W*/*T*T 


5563 


SEQ ID NO: 1238


ntfocttccataatttecc 


10933 


10952 1 


5 


SEQ ID NO: 236


aaf*af*af^*fafnf*l'^a♦^*^f* 
daii/dvdlWldiyuwdlUlw 


'iR90 

wV/^l/ 


5639 


SEQ ID NO: 1239


nanaRanftatRttRcitntt 
y Giy away waiwiiwy lyii 


11204 


11223 1 


5 

w 


SEQ ID NO: 237


iK«dy i^oy ^la idddy ^ciy 


5652 


5671 


SEQ ID NO: 1240


ctcictaaaaaccttactcia 


7780 


7799 1 


5 


SEQ ID NO: 238


y vdy d vciwiy iiyvicieiy y 


5667 


5686 


SEQ ID NO: 1241


nRfttraaacactaactac 

V w iiiwaay wa w%y a wig w 


11746 11765 1 


5 

w 


SEQ ID NO: 239


f^t/innn ana a^ataf^trifi 

iviyyyydyddudidtiyy 


5&66 


5ftft5 


SEQ ID NO: 1242


prannttttftT a ca aaa 
w way y uiiwwa w way a 


8038 


8057 1 


5 


SEQ ID NO: 240


HfH'f^triatnaHar^aan 
iii/iuii/diydUawciady 


50*^4 

i/ww*t 


5953 


SEQ ID NO: 1243


fttttftftaecaacaa a a a a 

wiiiiiwawwaawyy ayaa 


10838 10857 1 


5 


SEQ ID NO: 241


uiydywdydi/dyyudvW>iy 


6034 
v/uu*r 


6053 

Www 


SEQ ID NO: 1244


nannannnfttaanttnan 
way y ay y w liiaay iiway 


7599 


7618 1 


5 

w 


SEQ ID NO: 242


/*aaHtaanaa/^aafnaaf 
UddllladUdauaaiy ad I 




R0R5 


SEQ ID NO: 1245


a iiww I iwwiivawcici iiy 


8082 


8101 1 


5 

w 


SEQ ID NO: 243


lyyavyddciciyyciydo 


6140


6159 


SEQ ID NO: 1246


yiwdy wwwoyiiwwiiwwa 


10924 10943 1 


5 

w 


SEQ ID NO: 244


cttttactcagtgagccca 


6192 


6211


SEQ ID NO: 1247


tnnnrtaaaf*ni9taaaan 
lyyy wiciaawy loiyaaay 


7827 


7846 1 


5 

w 


SEQ ID NO: 245


tcattgatgctttagagat 


6217 


6236


SEQ ID NO: 1248


dlwUwdiddy liwdaiyci 


13174 13193 1 


5 

w 


SEQ ID NO: 246


aaaaccaagatgttcactc 


6295 


6314


SEQ ID NO: 1249


y ay ly oaa ly wiy li&iii 


8630 


8649 1 


5 

w 


SEQ ID NO: 247


aggaatcgacaaaccatta 


6357 




SEQ ID NO: 1250


uadiydUiiwddyiiwwi 


8294 


8313 1 


5 

w 


SEQ ID NO: 248


tagttgtactggaaaacgt 


6376 


6395 


SEQ ID NO: 1251


9(^nH9fmr*trf9an9Rt9 
awy iiay WW iwiaay a w la 


11928 


11947 1 


5 

w 


SEQ ID NO: 249


ggaaaacgtacagagaaag 


6386 


6405


SEQ ID NO: 1252


wiiLiawddiLWdiuiww 


13014 13033 1 


5 

w 


SEQ ID NO: 250


gaaaacgtacagagaaagc 6387 


6406 


SEQ ID NO: 1253


y wiiiw iw I (.wwd wd I liw 


10052 


10071 1 


5 


SEQ ID NO: 251


aaagctgaagcacatcaat 


6401 


6420


SEQ ID NO: 1254


aiiyaiyiiayay ly will 


6984 


7003 1 


5 


SEQ ID NO: 252


aagctgaagcacatcaata 


6402 


6421


SEQ ID NO: 1255


la iiy aiy iiayciy ly wii 


6983 


7002 1 


5 

w 


SEQ ID NO: 253


tgaagcacatcaatattga 


6406 


6425


SEQ ID NO: 1256


fftaaAAttaataattttca 
iwa a WW i&aa ly a 1 1 1 iwa 


8287 


8306 1 


5 


SEQ ID NO: 254


atcaatattgatcaatttg 


6414 


6433 


SEQ ID NO: 1257


caaaaccatcactoatoat 

wGwia M w wci iwci w 0 CI i 


1660 


1679 1 


5 

w 


SEQ ID NO: 255


taatgattatctgaattca 


6476 


6495


SEQ ID NO: 1258


tgaaatcattgaaaaatta


6719 


6738 1 


e 
0 


SEQ ID NO: 256 


gattatctgaattcattca 


6480 


6499 


SEQ ID NO: 1259 


tgaagtagctgagaaaatc 


7094 


7113 1 


5 


SEQ ID NO: 257 


aattgggagagacaagttt 


6498 


6517 


SEQ ID NO: 1260 


aaacattcctttaacaatt 


9488 


9507 1 


5 


SEQ ID NO: 258 


aaaatagctattgctaata 


6693 


6712 


SEQ ID NO: 1261 


tattgaaaatattgatttt 


6806 


6825 1 


5 


SEQ ID NO: 259 


aaaattaaaaagtcttgat 


6731 


6750 


SEQ ID NO: 1262 


atcatatccgtgtaatttt 


6757 


6776 1 


5 


SEQ ID NO: 260 


ttgaaaatattgattttaa 


6808 


6827 


SEQ ID NO: 1263 


ttaatcttcataagttcaa 


13171 


13190 1 


5 


SEQ ID NO: 261 


agacatccagcacctagct 


6938 


6957 


SEQ ID NO: 1264 


agcttggttttgccagtct 


2458 


2477 1 


5 
SEQ ID NO: 262


C3atncaiu93339ddt 


/021 


f U4U 


SEQ ID NO: 1265




OUO^ 


Rini 1 

0 IV 1 1 


5 

w 


SEQ ID NO: 263 


aQQiutaaiggataaau 


"7 AT A 


7*1 

f 1570 




a aiig iiy a olciy aoci awui 


1*^147 


1*^166 1 


5 


SEQ ID NO: 264 






7*3 CO 

r 202 




gyaUdeiygMA^ayclcllwiy 


19'vl'i 


12564 1 


5 


SEQ ID NO: 265 


taagataaaagattactu 


72o2 


r2o1 


SEQ ID NO: 1268




I'^l'^'^ 

1 w i 


13174 1 


5 

w 


SEQ ID NO: 266 


aaagauacuigag^^^i 




r200 


SEQ ID NO: 1269


aUlUlladolwdlluUlll 


9'tO I 


9500 1 

9wW 1 


5 

w 


SEQ ID NO: 267


g agaaanagug gatna 


f ^0 1 


t OUU 


SEQ ID NO: 1270


hfl9 9 rt f^patf ^^ 9 nffif ritft 


12962 

1 &9U&> 


12981 1 


5 


SEQ ID NO: 268


aniaiigaigaiycigic 


r ^:^30 


7*514 




nsp 9tnHnsit999n9Si9f 
^ctucil^uydidda^daai 


7371 


7390 1 

r www 1 


5 

w 


SEQ ID NO: 269


gaauaiciiiiaaaacai 


r 0^0 


f 0*rO 


SEQ ID NO: 1272


9frnt9lf^99tnn9R9ttR 

diyidiwdOdiyjjdwciiiw 


7677 

f W I f 


7696 1 

# Vww 1 


5 


SEQ ID NO: 270 


naccaccagiugiagai 




7/1*30 


SEQ ID NO: 1273


dLi/iy^ddi/wiiydd^idd 


10731 


10750 1 

1 V r w w 1 


5 


SEQ ID NO: 271 


ttgcagtgtatctggaaag




/00«7 


SEQ ID NO: 1274


f^ftftr'scaftan 9tnr'9 9 
ullllvdVd luay diy vd d 


8412 


8431 1 


w 


SEQ ID NO: 272


caiicag caggaacncaa 


f oyi 


7710 


SEQ ID NO: 1275


iiyddyydwiLwdyyadiy 


12001 


12020 1 


5 


SEQ ID NO: 273


acaccigauuaiagicc 


/you 


7Q(%Q 


SEQ ID NO: 1276


nn9ptp99rin9t99f*ntnf 
y^dwiUddy^dladuy L^l 


12606 


12625 1 


5 


SEQ ID NO: 274 


ggattccatcagttcagat 




OUUO 


SEQ ID NO: 1277


dlCuCdd^alldldWC 


I'^llf^ 


1313S 1 


5 


SEQ ID NO: 275 


ttgtagaaatgaaagtaaa


8104 


012o 


SEQ ID NO: 1278


ludxg auaigicdoCdd 




12?71 1 


1^ 

w 


SEQ ID NO: 276 


ctgaacagtgagctgcagt 


Ol4o 


f)He7 
OlDf 


SEQ ID NO: 1279


aciggacuciciagicag 


OOw 1 


RR9n 1 


5 


SEQ ID NO: 277 


aatccaatctcctcttttc 




o4l0 


SEQ ID NO: 1280


gddddaigddgiccgg dU 


11009 


11028 1 


<; 


SEQ ID NO: 278


attttgattttcaagcaaa


8524


8543


SEQ ID NO: 1281


uigcaaguaaugaaaai 


14015 


14034 1 


c 

w 


SEQ ID NO: 279 


ttttgattttcaagcaaat


8525


8544


SEQ ID NO: 1282


al4tnaHfaonfrnt9999 
alliy alUdagig Iddacl 


9814 


9633 1 


5 


SEQ ID NO: 280


tgattttcaagcaaatgca 


8528


8547


SEQ ID NO: 1283


tnr>99ntt999n9999tP9 

iy(rfddyUdddyddddll«d 


14017 


14036 1 


5 

w 


SEQ ID NO: 281


atgctgttttttggaaatg


8637


8656


SEQ ID NO: 1284


/«9Hnpit9nn9n9P9nr^9fr 

udiiyyidyydydwdyk^i 


11195 


11214 1 


5 

w 


SEQ ID NO: 282


tgctgttttttggaaatgc 


8638


8657


SEQ ID NO: 1285


no9ttnrif9nn9n9r>9fiP9 
y v/d iiyy idy y dy dudy i^d 


11194 


11213 1 


5 


SEQ ID NO: 283


aaaaaaaiacactggagci 




ft71 7 
Of 1 f 


SEQ ID NO: 1286


9 n pt 9 n 9 n n n 1* ntnt fttf 
dywidydyyyv/viuiiui 


10825 10844 1 


5 

w 


SEQ ID NO: 284


actggagcttagtaatgga


8708


8727


SEQ ID NO: 1287


♦f»p9/*f/*909tW*tW*flPlt 

lk#k«di# ivdVfd iiA#ivVdy 1 


1281 


1300 1 


5 

w 


SEQ ID NO: 285


cttctggaaaagggtcatg


Our 0 


ooy f 


SEQ ID NO: 1288


P9fn99PPpr*t9P9(ri99n 
vd ly dd V V V vid i#d ly d ay 


13751 


13770 1 


5 


SEQ ID NO: 286


ggaaaagggtcatggaaat


0000 




SEQ ID NO: 1289


atttnsasinttopiftttRC 
aiiiy ciaciy iioy iiiiuo 


9274 


9293 1 


5 

w 


SEQ ID NO: 287 


gggcctgccccagattctc


8902


8921


SEQ ID NO: 1290


/i9n99P9tf nfrnrmnnppp 
ydydawdiiuiyyayy 


9432 


9451 1 


5 

w 


SEQ ID NO: 288


ttctcagatgagggaacac


8916


8935


SEQ ID NO: 1291


y ly lw> I lifoei d y u ly a y ca a 


12408 


12427 1 


5 

w 


SEQ ID NO: 289


gatgagggaacacatgaat


8922


8941


SEQ ID NO: 1292


9tf/^i'*9nf*Hcf*r»f*9n9tn 
a Lii/Udy w iiwUvwdViHa i v 


8330 


8349 1 


5 

w 


SEQ ID NO: 290 


cntggactgtccaataag 


O9/0 


AQQ7 
099/ 


SEQ ID NO: 1293


uiidiy y yd iiiuuid ady 


11159 


11178 1 


5 

w 


SEQ ID NO: 291 


gcatccacaaacaatgaag 


9252


9271


SEQ ID NO: 1294


cucaiciyicauyaiyv 


10219 


10238 1 


5 

w 


SEQ ID NO: 292 


cacaaacaatgaagggaat 


9257


9276


SEQ ID NO: 1295


aiiccciy ady uydiy ly 


11480 11499 1 


5 


SEQ ID NO: 293 


ccaaaatttctctgctgga 


9407 


9426 


SEQ ID NO: 1296 


tccatcacaaatcctttgg 


9663 


9682 1 


0 


SEQ ID NO: 294 


caaaatttctctgctggaa 


9408 


9427 


SEQ ID NO: 1297 


ttccatcacaaatcctttg 


9662 


9681 1 


5 


SEQ ID NO: 295 


tctgctggaaacaacgaga 


9417 


9436 


SEQ ID NO: 1298 


tctcaagagttacagcaga 


13221 


13240 1 


5 


SEQ ID NO: 296 


ctgctggaaacaacgagaa 


9418 


9437 


SEQ ID NO: 1299 


ttctcaagagttacagcag 


13220 


13239 1 


5 


SEQ ID NO: 297 


agaacattatggaggccca 


9433 


9452 


SEQ ID NO: 1300 


tgggcctgccccagattct 


8901 


8920 1 


5 


SEQ ID NO: 298 


agaagcaaatctggatttc 


9467 


9486 


SEQ ID NO: 1301 


gaaatcttcaatttattct 


13813 


13832 1 


5 


SEQ ID NO: 299 


tttctctctatgggaaaaa 


9557 


9576


SEQ ID NO: 1302 


tttttgcaagttaaagaaa 


14013 


14032 1 


5 
SEQ ID NO: 300


tcaaaacatcaaatccttt 

IwCIMGIMwdlwCiCiCiiwwbb* 


9704 


9/ ibO 


SEQ ID NO: 1303


dddgddctdiCdggdicigd 


14U40 




c 
O 


SEQ ID NO: 301


caaaascaatacattaaat 


9743 

W f "Tw 


9762 

wr 


SEQ ID NO: 1304


dU/ Id ly Lf Ud luicuuig 


5625 


5644 1 


e 

o 


SEQ ID NO: 302


tacacattaatcctaccat 


9993 

wwww 


10012 


SEQ ID NO: 1305


d y dg iMiid iig ig Id 


14081 


14100 1 


c 

9 


SEQ ID NO: 303


an f Rfl fl a ta ttatta ctca 


10186 


in2n<? 

1 U4>wO 


SEQ ID NO: 1306


tgagaactacgagctgact


4799 


4818 1 


c 

o 


SEQ ID NO: 304


nnannntaot'nstaacaat 


10328 




SEQ ID NO: 1307


actggtggcaaaaccctcc


2726 


2745 1 


D 


SEQ ID NO: 305


wd dcioi^ a dd Liw wctd I 


1 U0S7O 




SEQ ID NO: 1308


dugaagiacctactntg 


8358 


8377 1 


e 
D 


SEQ ID NO: 306


aaaaaccoaaattccaatt 

QIGIdOiyww^nGldilWQIdll 


10397 

1 Www r 


10418 


SEQ ID NO: 1309


9 off n o o nt o/^f^f Q /«4l4t 
dd li^ddy IdCCldClIU 


8357 


8376 1 


c 


SEQ ID NO: 307


ttcaancaaaaacttaata 


10428 


10447 


SEQ ID NO: 1310


caiia iggcccucgig aa 


13250 13269 1 


c 
o 


SEQ ID NO: 308


RpfpftafiftttrRattna 
wViviidwiiiiv^waii^Ci 




1 UwQw 


SEQ ID NO: 1311


iCddddgaagcccaagagg 


12939 


12958 1 


D 


SEQ ID NO: 309 


taaaaccaacacttactta 


10655 


10674 


SEQ ID NO: 1312


t*ddy Cdiuiy d lly dCiCd 


12668 12687 1 


e 
9 


SEQ ID NO: 310


ca ct tacttci a a ttcca aa 


10664 


10683 


SEQ ID NO: 1313


cttgaacacaaagtcagtg


6000 


6019 1 


c 
0 


SEQ ID NO: 311 


aaaataaaaaaaaatttta 


10743 


10762 


SEQ ID NO: 1314


caaaaacattttcaacttc


5279 


5298 1 


c 
0 


SEQ ID NO: 312


cctGnaactRtctcAafaa 


10874 


10893 


SEQ ID NO: 1315


uCdiUdCdydiCiiCdgg 


11364 


11383 1 


D 


SEQ ID NO: 313


a^uk^y iddv^V'dwwdy 


11 17R 


1 1 1Q<^ 


SEQ ID NO: 1316


ciggduccacaxgcagci 


11847 


11866 1 


0 


SEQ ID NO: 314


aaaattccctgaagttgat


11477 


11496 


SEQ ID NO: 1317


diCdtdicwyiyiddiiii 


6757 


6776 1 


0 


SEQ ID NO: 315


caa atflflcatta cf arth 




11894 


SEQ ID NO: 1318


dddyviydyddyaddXcig 


12416 


12435 1 


D 


SEQ ID NO: 316


aaatoacatfrintnrtH'n 


1 1 UwV 


11825 


SEQ ID NO: 1319


uaaaywiydyddydddlCl 


12415 


12434 1 


c 
D 


SEQ ID NO: 317


ly iiy dddwdy iww ly yd I 


1 18')4 


1 lOOO 


SEQ ID NO: 1320


diccaaga^agaicaaca 


13095 


13114 1 


0 


SEQ ID NO: 318


catattc a aaacf aaotta 


12221 


12240 


SEQ ID NO: 1321


OQfl/>totoinoff o/«fo4/Y 

v/dd(/icidydUdwLdxy 


13623 


13642 1 


D 


SEQ ID NO: 319


aaaaatHiat'paasanflan 
ciGidydi(wiiwddddy doy 


1 £www 


1294Q 


SEQ ID NO: 1322


CUCd dHldUCUClU 


13818 


13837 1 


c 
P 


SEQ ID NO: 320


a ttttcicaanta atan aa n 
a iiiiwwciciw laci iciy aoy 


13026 


1304*5 


SEQ ID NO: 1323


CllCdddydwllddadddl 


8006 


8025 1 


c 
D 


SEQ ID NO: 321


dd lid idivwddy d ly dy d 


1308Q 


13108 


SEQ ID NO: 1324


iciuiiccicCdiygddU 


10471 


10490 1 


c 


SEQ ID NO: 322


lu^ciyyady wiiwiwddud 


13910 




SEQ ID NO: 1325


cttcataagttcaatgaat


13176 


13195 1


c 
o 


SEQ ID NO: 323


iiydywddiiiviy^^udy 


1 0*t^«7 


13448 


SEQ ID NO: 1326


ciyuydddgdinaiCda 


12924 


12943 1 


C 


SEQ ID NO: 324


V ly d Id idwd i^^d wy y d y i 


13704 


13723


SEQ ID NO: 1327


Qotoootnn4/'io<9<9ftoaM 

dCicaaiggigaaaucag 


7457 


7476 1 


e 
D 


SEQ ID NO: 325


acatcacggagttactgaa


13711 


13730 


SEQ ID NO: 1328


iiCdyddycidayCddiyi 


7231 


7250 1 


c 
D 


SEQ ID NO: 326


actgcctatattgataaaa


13874 


13893 


SEQ ID NO: 1329


ttttggcaagctatacagt


8372 


8391 1 


D 


SEQ ID NO: 327


aggatggcattttttgcaa


14003 


14022 


SEQ ID NO: 1330


ffo O99no99nto(ttoot 

iiy Cddy Wddy ici iLCci 


3005 


3024 1 


e 

o 


SEQ ID NO: 328


ttttttgcaagttaaagaa


14012 


14031 


SEQ ID NO: 1331


ttctctctatgggaaaaaa


9558 


9577 1 


0 


SEQ ID NO: 329


tccagaactcaagtcttca


1619 


1638


SEQ ID NO: 1332


lydddLyciyuiuiygd 


8633 


8652 3 




SEQ ID NO: 330 


aattaatoaaaaaanttnt 


1948 


1967 


SEQ ID NO: 1333


9n99tntnt9r^09nn99of 
dyddiuiyiduudyyddui 


12556 


12575 3 






diUdCdyCICIQdCadgX 




044o 


SEQ ID NO: 1334


acttcagagaaatacaaat 


11401 


11420 3 


4 


SEQ ID NO: 332 


gattatctgaattcattca 


6480 


6499 


SEQ ID NO: 1335 


tgaaaccaatgacaaaatc 


7421 


7440 3 


4 


SEQ ID NO: 333 


gtgcccttctcggttgctg 


18 


37 


SEQ ID NO: 1336 


cagctgagcagacaggcac 


6031 


6050 2 


4 


SEQ ID NO: 334


attcaagcacctccggaag 


245 


264 


SEQ ID NO: 1337 


cttcataagttcaatgaat 


13176 


13195 2 


4 


SEQ ID NO: 335 


gactgctgattcaagaagt 


308 


327 


SEQ ID NO: 1338 


acttcccaactctcaagtc 


13407 13426 2 


4 


SEQ ID NO: 336 


ttgctgcagccatgtccag 


475 


494 


SEQ ID NO: 1339 


ctgggcagctgtatagcaa 


5881 


5900 2 


4 


SEQ ID NO: 337 


agaaagatgaacctactta 


547 


566 


SEQ ID NO: 1340 


taagtatgatttcaattct 


10490 10509 2


4 
SEQ ID NO: 338


9^% 0% ^ V^%^^#H «^ 


1 HOT 




SEQ ID NO: 1341


agucaaigaauiauca 


10 100 


102U2 ^ 


A 
*T 




SEQ ID NO: 339


atctctcttgccacagctg


iono 




SEQ ID NO: 1342


cagcccagccatttgagat


9229 


9248 2


A 

4 


SEQ ID NO: 340


tctctcttgccacagctga






SEQ ID NO: 1343


tcagcccagccatttgaga


9228


9247 2


A 

4 




SEQ ID NO: 341


tgaggtgtccagccccatc 






SEQ ID NO: 1344


gatgggaaagccgccctca


5208


5227 2 


A 

4 


SEQ ID NO: 342


ccagaactcaagtcttcaa


1620


1639


SEQ ID NO: 1345


ttgaaagcagaacctctgg


5907 


5926 2


A 

4 


SEQ ID NO: 343


ctgaaaaagnagtgdaag 




lyou 


SEQ ID NO: 1346


cntcicgggaataucag 


1Uo2o 


10o42 2 


4 


SEQ ID NO: 344




tttttcccagacagtgtca 




2257 


SEQ ID NO: 1347


tgacaggcattttgaaaaa


9722 


9741 2 


4 


SEQ ID NO: 345




ttttcccagacagtgtcaa




01CD 

2258


SEQ ID NO: 1348


ttgacaggcattttgaaaa


9721 


9740 2 


4 


SEQ ID NO: 346


cattcagaacaagaaaatt




3414


SEQ ID NO: 1349


aattccaattttgagaatg


10406 


10425 2 


4 


SEQ ID NO: 347


tgaagagaagangaain 


OOZU 


00 oy 


SEQ ID NO: 1350


9494^%V9Vt#«bd& 911 At JM^Xjfc»^ 

aaatgtcagctctigitca 


10094 


1091 0 2 


A 

4 


SEQ ID NO: 348


uigaaiggaacacaggca 


OOOO 




SEQ ID NO: 1351


^ M 94 94 9^ 9>ft 44 4 94 #4 9H 9^ 94 94 94 94 94 94 

igccagnigaaaaacaaa 


lloOf 


11o2o 2 


4 


SEQ ID NO: 349




nctagaxtcg aatatca a 


•i-oyy 




SEQ ID NO: 1352


ttgacatgttgataaagaa 


7ooy 


7ooo 2 


A 

4 


SEQ ID NO: 350


gattcg aataicaaatica 




4423 


SEQ ID NO: 1353


49V9V9%9ti494 9to 94 ^&^k94 ^K^K^B ^s4^h 

tgaagtagaccaacaaaic 


7154 


'74'7'i 0 

7170 2 


4 


SEQ ID NO: 351




cgcaacgaccaacicgaag 




ouy4 


SEQ ID NO: 1354


#%4t/49^9*I/%449\9\944949449^ ^4^% 

ciicaggitccatcgtgca 


1 lO/O 


11090 2 


4 


SEQ ID NO: 352




iiaagcicicaaaigacai 


OOl / 


0000 


SEQ ID NO: 1355


aigiigaiaaagaaanaa 


/Of 4 


/ 090 2 


4 


SEQ ID NO: 353




caainaacaacaaigaai 


ouoo 


OUoO 


SEQ ID NO: 1356


9^ T 9^ 94 94 4 9V 94 9^494 4 94 449M 

aucaaactgcciaiaitg 


lOooo 


1 OoO r 2 


4 


SEQ ID NO: 354




igaaiacagccaggaciig 


DUOU 


Duyy 


SEQ ID NO: 1357


94 ^^^4 ^t^ft ^19^^^9494 9494^^494 4494 94 

caagagcacacggicnca 


lUo/y 


lUoyo 2 


4 


SEQ ID NO: 355




catcaaiattgatcaain 


o41o 


04o2 


SEQ ID NO: 1358


9%^49444'.«m 94 a«49H «49taA4 9B 9* 

aaattccctgaagttgatg


11478 


AAA tS"! 0 

11497 2 


4 


SEQ ID NO: 356


ugagcaigicaaacacu 


TORI 




SEQ ID NO: 1359


94 94 4 94 •■4 94 494 #4 4^4 94 94 4494 94 94 

aagiaagtgctag g itcaa 


90/0 


9092 2 


4 


SEQ ID NO: 357


igaaggagaciaucagaa 






SEQ ID NO: 1360


4494T9^i94J^949%9<t 94 A '^Ttf^TTtf^ A 

ucigcacagaaaiauca 


1040O 


1040/ ^ 


4 


SEQ ID NO: 358




Xicaggcicticagaaagc 




/y4u 


SEQ ID NO: 1361


gcngctaacctctctgaa 


12004 


12o2o 2 


4 


SEQ ID NO: 359


iccacaaattgaacatccc 




o/yo 


SEQ ID NO: 1362


gggacctaccaagagtgga 


12o2o 


12544 2 


4 


SEQ ID NO: 360


igaaiaccaatgcigaaci 


iuioy 


lU17o 


SEQ ID NO: 1363


agttcaatgaatttattca 


•101 DO 

131 00 


13202 2 


4 


SEQ ID NO: 361




laaactaaiagaigtaatc 


i^oyu 


i2yuy 


SEQ ID NO: 1364


gattactatgaaaaatua 


1ooo2 


10o51 2 


4 


SEQ ID NO: 362




ngaccigiccaucaaaa 




looyi 


SEQ ID NO: 1365


44449^9% 94 949M 9494 94494449494 94 

tutaaaag aaatcncaa 


lOouO 


1oo24 2 


4 


SEQ ID NO: 363


gggcigagigccciicicg 


1 T 


OA 
OU 


SEQ ID NO: 1366


n 9^9*^94 94 94 94 9W MM 949494 9494 94 9^94 #4 

cgaggccaggccgcagccc 


/O 


90 1 


4 


SEQ ID NO: 364




g gc vy agig cccuc vcg g 


HO 


on 


SEQ ID NO: 1367


94#49V94 9*t9^ 949494 94 iFt^%0^^^^ #^94 9494 

ccgaggccaggccgcagcc 




y4 1 


A 

4 


SEQ ID NO: 365




Ctgag igcccncxcg gn 




00 


SEQ ID NO: 1368


9%94 94 949^^9W 949^49^1 <^ 94 4 94494 rf^ 9^1 

aaccQ igccig aa icicag 


1 1049 


llOOo 1 


4 


SEQ ID NO: 366




iCicggiigcigccgcTga 


^0 


44 


SEQ ID NO: 1369


49\9^ #4 94^^ 94 9494^94 94^94^ 9^ 

icagcigaccicaicgaga 


210U 


0*170 *! 
£. \ / 9 1 


4 


SEQ ID NO: 367




caggccgcagcccaggagc 






SEQ ID NO: 1370


9*1 94^94^9^94 94 94 944 4#4 94 4#%944^ 

gcictgcagcncaiccig 


000 


00/ 1 


4 


SEQ ID NO: 368


gc^gcgcigccigcgcig 




'I AO 


SEQ ID NO: 1371


9494 9494949494949^94949444494 A 9^ 94 

ca gcacagaccauicagc 


4244 


42O0 1 


4 


SEQ ID NO: 


369 


tgctgctggcgggcgccag 


169 


188 


SEQ ID NO 


: 1372 


ctggatgtaaccaccagca 


11178 


11197 1 


4 


SEQ ID NO: 


370 


ctggtctgtccaaaagatg 


219 


238


SEQ ID NO 


: 1373 


catcctgaagaccagccag 


380 


399 1 


4 


SEQ ID NO: 


371 


ctgagagttccagtggagt 


283 


302 


SEQ ID NO 


:1374 


actcaccctggacattcag 


3383 


3402 1 


4 


SEQ ID NO: 


372 


tccagtggagtccctggga 


291 


310 


SEQ ID NO 


: 1375 


tcccggagccaaggctgga 


2675 


2694 1 


4 


SEQ ID NO: 


373 


aggttgagctggaggttcc 


346 


365 


SEQ ID NO 


: 1376 


ggaaccctctccctcacct 


4728 


4747 1 


4 


SEQ ID NO: 


374 


tgagctggaggttccccag 


350 


369 


SEQ ID NO 


: 1377 


ctgggaggcatgatgctca 


9163 


9182 1 


4 


SEQ ID NO: 


375


tctgcagcttcatcctgaa 


370 


389 


SEQ ID NO 


: 1378 


ttcaaatataatcggcaga 


3261 


3280 1 


4 
SEQ ID NO: 376




gccagi^cscccigaaay^ 


oy*t 


A-i 1 


SEQ ID NO: 1379
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Df 04 
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A 
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Ol f 
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A 

4 
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/I 
4 
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4 
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1 1U4Z 
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A 
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OPU IU OIU. lOOQ 


ligalUIoaCaaaSgigg 


DO 1 / 


DOOO 1 


A 
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ocu IU INU. ioyo 


gaigoigssosgigsgotg 


0144 


Q4ftO <1 
OIOO 1 


4 


SEQ ID NO: 394


9010X03103339903193 


f 00 


ouo 


SEQ ID NO: 1397


4m^^^ ^m^m4^m^m4mmm M 

1031330391301919390 


•1 ni^7 
1U00/ 


lUOOD 1 


4 


SEQ ID NO: 395


0011910330101931039 


ft*! i 


oou 


SEQ ID NO: 1398


0x93919991113103399 


jZ440 


•10>lftil i 
1Z404 1 


A 

4 


SEQ ID NO: 396


MTTfVTM'^ ^^^%TMr^V ^%^M M^^^% 

0119103301019310390 


mo 


oOl 


SEQ ID NO: 1399


#v m4#v m m4#^#^ #v444^%4 

90i9a9tgg9tn3XC33g 


IZ444 


1Z400 1 


4 


SEQ ID NO: 397


A M MM M^M^M MMM M #V ^ ^ AM M 

39008X0190339939^33 




on*) 

yuo 


SEQ ID NO: 1400


ugosaigagcioatggct 


OOUO 


0o24 1 


4 


SEQ ID NO: 398


900310190339939(^330 


ooO 


yu4 


SEQ ID NO: 1401


gugcaatgagcioatggo 


OoU4 


OoZo 1 


4 


SEQ ID NO: 399


M^^MM^^MMf 44M^MMfrMM 

oiiooigooxxxcxocxao 


onn 
yuo 


097 

yZf 


SEQ ID NO: 1402


m4mmm mm4mmm4m^v^m M M M 

9x39933x33319939339 


OA 

y4oo 


y4/z 1 


4 


SEQ ID NO: 400


0111010013033933133 


yio 


yoo 


SEQ ID NO: 1403


113x190193310033339 


10040 


<lQftfi7 4 
lOODf 1 


A 

4 


SEQ ID NO: 401


gsioaacsgocgoxxoui 


ORQ 

yoy 


•I nnn 
lUUO 


SEQ ID NO: 1404


M M MM MM m4m M m4m M 4^V m4m 

3339003103019319310 


10D1 


IOOU 1 


4 


SEQ ID NO: 402


axcaaoagocgouoxxig 


Qon 

99U 


<innQ 

1UUS9 


SEQ ID NO: 1405


MM<a*^MMM^4M*^M4^*^4^<Q4 

0333900310301931931 


100U 


lor y 1 


• 

A 

4 


SEQ ID NO: 403


0039009011011199^9 3 


QQA 

yy*i 


i nf) 
1 UTO 


SEQ ID NO: 1406


4mmmmmm4mm444mmm4m4 

103033310011199^^9^ 


yoo/ 


yooo 1 


4 


SEQ ID NO: 404


339 319 99c ^^^9^31X19 


1 UZO 


-1 nAO 

1U4Z 


SEQ ID NO: 1407


MM mmm4mm m mmm^^mm4m44 

03333X39 3 399 933IOXX 


zuoy 


ZUoo 1 


4 


SEQ ID NO: 405




tgttttgaagactctccag


1082


1101


SEQ ID NO: 1408


MTM m4m^%m4mm444m#%^%mm 

0X99133013011133303 


Ryf fi7 


OOUO 1 


4 


SEQ ID NO: 406


ttgaagactctccaggaac


1086


1105


SEQ ID NO: 1409


^^^^^^^^ M m444m44mm M 

9x1033x933111311033 


10104 


lOZUu 1 


4 


SEQ ID NO: 


407 


aactgaaaaaactaaccat


1102 


1121 


SEQ ID NO: 1410 


atggcattttttgcaagtt


14006 


14025 1 


4 


SEQ ID NO: 


408 


ctgaaaaaactaaccatct


1104 


1123 


SEQ ID NO: 1411 


agattgatgggcagttcag


4564 


4583 1 


4 


SEQ ID NO: 


409 


aaaactaaccatctctgag


1109 


1128 


SEQ ID NO: 1412


ctcaaagaatgactttttt


2570 


2589 1 


4 


SEQ ID NO: 


410 


tgagcaaaatatccagaga


1124 


1143 


SEQ ID NO: 1413 


tctccagataaaaaactca


12201 


12220 1 


4 


SEQ ID NO: 


411 


caataagctggttactgag


1154 


1173 


SEQ ID NO: 1414 


ctcagatcaaagttaattg


12266 


12285 1


4 


SEQ ID NO: 


412 


tactgagctgagaggcctc


1166 


1185 


SEQ ID NO: 1415 


gagggtagtcataacagta


10329 


10348 1 


4 


SEQ ID NO: 


413 


gcctcagtgatgaagcagt


1180 


1199 


SEQ ID NO: 1416 


actgttgactcaggaaggc


12572 


12591 1 


4 
SEQ ID NO: 414


agtcacatctctcttgcca 


1196 


1215 


SEQ ID NO: 1417 


tggccacatagcatggact 


8858 


8877 1 


4 


SEQ ID NO: 415


atctctcttgccacagctg 


1202 


1221 


SEQ ID NO: 1418 


cagctgacctcatcgagat 


2161 


2180 1 


4 


SEQ ID NO: 416 


tctctcttgccacagctga 


1203 


1222 


SEQ ID NO: 1419 


tcagctgacctcatcgaga 


2160 


2179 1 


4 


SEQ ID NO: 417 


tgccacagctgattgaggt 


1210 


1229 


SEQ ID NO: 1420 


acctgcaccaaagctggca 


13955 


13974 1 


4 


SEQ ID NO: 418 


gccacagctgattgaggtg 


1211 


1230 


SEQ ID NO: 1421 


caccaaaaaccccaatggc 


11240 


11259 1 


4 


SEQ ID NO: 419 


tcactttacaagccttggt


1240 


1259 


SEQ ID NO: 1422 


accagatgctgaacagtga


8140 


8159 1 


4 


SEQ ID NO: 420 


cccttctgatagatgtggt 


1324 


1343 


SEQ ID NO: 1423 


accacttacagctagaggg 


10816 


10835 1 


4 


SEQ ID NO: 421 


gtcacctacctggtggccc 


1341 


1360 


SEQ ID NO: 1424 


gggcgacctaagttgtgac 


3431 


3450 1 


4 


SEQ ID NO: 422 


ccttgtatgcgctgagcca 


1432 


1451 


SEQ ID NO: 1425 


tggctggtaacctaaaagg 


5578


5597 1 


4 


SEQ ID NO: 423 


gacaaaccctacagggacc 


1472 


1491 


SEQ ID NO: 1426 


ggtcctttatgattatgtc 


12347 


12366 1 


4 


SEQ ID NO: 424 


tgctaattacctgatggaa 


1508 


1527 


SEQ ID NO: 1427 


ttcccaaaagcagtcagca 


9930 


9949 1 


4 


SEQ ID NO: 425 


tgactgcactggggatgaa 


1538


1557 


SEQ ID NO: 1428 


ttcaggtccatgcaagtca 


10909 10928 1 


4 


SEQ ID NO: 426 


actgcactggggatgaaga 


1540 


1559 


SEQ ID NO: 1429 


tcttgaacacaaagtcagt 


5999 


6018 1 


4 


SEQ ID NO: 427 


atgaagattacacctattt


1552 


1571 


SEQ ID NO: 1430 


aaatgaaagtaaagatcat 


8110 


8129 1 


4 


SEQ ID NO: 428 


accatggagcagttaactc 


1602 


1621 


SEQ ID NO: 1431 


gagtaaaccaaaacttggt 


9016 


9035 1 


4 


SEQ ID NO: 429 


gcagttaactccagaactc 


1610 


1629 


SEQ ID NO: 1432 


gagttactgaaaaagctgc 


13719 


13738 1 


4 


SEQ ID NO: 430 


cagaactcaagtcttcaat 


1621 


1640 


SEQ ID NO: 1433 


attggatatccaagatctg 


1925 


1944 1 


4 


SEQ ID NO: 431 


caggctctgcggaaaatgg 


1695 


1714 


SEQ ID NO: 1434 


ccatgacctccagctcctg


2477 


2496 1 


4 


SEQ ID NO: 432 


ccaggaggttcttcttcag 


1730 


1749 


SEQ ID NO: 1435 


ctgaaatacaatgctctgg 


5611 


5530 1 


4 


SEQ ID NO: 433 


ggttcttcttcagactttc 


1736 


1755 


SEQ ID NO: 1436 


gaaaaacttggaaacaacc 


4431 


4460 1 


4 


SEQ ID NO: 434 


tttccttgatgatgcttct 


1751 


1770 


SEQ ID NO: 1437 


agaatccagatacaagaaa 


6885 


6904 1 


4 


SEQ ID NO: 435 


ggagataagcgactggctg 


1773 


1792 


SEQ ID NO: 1438 


cagcatgcctagtttctcc 


9944 


9963 1 


4 


SEQ ID NO: 436 


gctgcctatcttatgttga 


1788 


1807 


SEQ ID NO: 1439 


tcaatatcaaaagcccagc 


12037 


12056 1 


4 


SEQ ID NO: 437 


actttgtggcttcccatat


1882 


1901 


SEQ ID NO: 1440 


atatctggaaccttgaagt 


10729 


10748 1 


4 


SEQ ID NO: 438 


gccaatatcttgaactcag 


1902 


1921 


SEQ ID NO: 1441 


ctgaactcagaaggatggc 


13992 


14011 1 


4 


SEQ ID NO: 439 


aatatcttgaactcagaag 


1905 


1924 


SEQ ID NO: 1442 


cttccattctgaatatatt 


13370 13389 1 


4 


SEQ ID NO: 440 


ctcagaagaattggatatc 


1916 


1935 


SEQ ID NO: 1443 


gataaaagattactttgag 


7265 


7284 1 


4 


SEQ ID NO: 441 


aagaattggatatccaaga 


1921 


1940 


SEQ ID NO: 1444 


tcttcaatttattcttctt 


13817 


13836 1 


4 


SEQ ID NO: 442 


agaattggatatccaagat 


1922 


1941 


SEQ ID NO: 1445 


atcttcaatttattcttct 


13816 


13835 1 


4 


SEQ ID NO: 443 


tggatatccaagatctgaa 


1927 


1946 


SEQ ID NO: 1446 


ttcacataccagaattcca 


8317 


8336 1 


4 


SEQ ID NO: 444 


atatccaagatctgaaaaa


1930 


1949 


SEQ ID NO: 1447 


tttttaaccagtcagatat


10177 


10196 1 


4 


SEQ ID NO: 445 


tatccaagatctgaaaaag 


1931 


1950 


SEQ ID NO: 1448 


ctttttaaccagtcagata 


10176 


10195 1 


4 


SEQ ID NO: 446 


caagatctgaaaaagttag 


1935 


1954 


SEQ ID NO: 1449 


ctaaattcccatggtcttg 


4965 


4984 1 


4 


SEQ ID NO: 447 


aagatctgaaaaagttagt 


1936 


1955 


SEQ ID NO: 1450 


actaaattcccatggtctt


4964 


4983 1 


4 


SEQ ID NO: 448 


tgaaaaagttagtgaaaga 


1942 


1961 


SEQ ID NO: 1451 


tctttctcgggaatattca 


10622 


10641 1 


4 


SEQ ID NO: 449 


tccaactgtcatggacttc 


1982 


2001 


SEQ ID NO: 1452 


gaagcacatatgaactgga 


13937 


13956 1 


4 


SEQ ID NO: 450 


tcagaaaattctctcggaa 


1999 


2018 


SEQ ID NO: 1453


ttcctttaacaattcctga 


9493 


9512 1 


4 


SEQ ID NO: 451 


ttccatcacttgacccagc 


2044 


2063 


SEQ ID NO: 1454 


gctgacatagggaatggaa 


8433 


8452 1 


4 
SEQ ID NO: 452


cccagcctcagccaaaata 


2057 


2076 


SEQ ID NO: 1455 


tattctatccaagattggg 


7812 


7831 


1 


4 


SEQ ID NO: 453


agcctcagccaaaatagaa 


2060 


2079 


SEQ ID NO: 1456 


ttctatccaagattgggct 


7814 


7833 


1 


4


SEQ ID NO: 454


atcttatatttgatccaaa


2083 


2102 


SEQ ID NO: 1457 


tttgaaaaacaaagcagat 


11813 


11832 1 


4


SEQ ID NO: 455


tcttatatttgatccaaat 


2084 


2103 


SEQ ID NO: 1458 


attttttgcaagttaaaga 


14011 


14030 1 


4 


SEQ ID NO: 456


cttcctaaagaaagcatgc 


2109 


2128 


SEQ ID NO: 1459 


gcatggcattatgatgaag 


3606 


3625 


1 


4 


SEQ ID NO: 457


ctaaagaaagcatgctgaa 


2113 


2132 


SEQ ID NO: 1460 


ttcagggtgtggagtttag 


5686 


5705 


1 


4 


SEQ ID NO: 458


taaagaaagcatgctgaaa 


2114 


2133 


SEQ ID NO: 1461 


tttcttaaacattccttta 


9482 


9501 


1 


4 


SEQ ID NO: 459


gagattggcttggaaggaa 


2175 


2194 


SEQ ID NO: 1462 


ttccctccattaagttctc 


11701 


11720 1 


4


SEQ ID NO: 460


ctttgagccaacattggaa 


2198 


2217 


SEQ ID NO: 1463 


ttccaatgaccaagaaaag 


11060 


11079 1 


4 


SEQ ID NO: 461


cagacagtgtcaacaaagc 


2245 


2264 


SEQ ID NO: 1464 


gcttactggacgaactctg 


6134 


6153 


1 


4 


SEQ ID NO: 462


cagtgtcaacaaagctttg 


2249 


2268 


SEQ ID NO: 1465 


caaattcctggatacactg 


9849 


9868 


1 


4 


SEQ ID NO: 463


agtgtcaacaaagctttgt 


2250 


2269 


SEQ ID NO: 1466 


acaagaatacgtctacact 


4351 


4370 


1 


4 


SEQ ID NO: 464


ctgatggtgtctctaaggt 


2290 


2309 


SEQ ID NO: 1467 


acctcggaacaatcctcag 


3325 


3344 


1 


4 


SEQ ID NO: 465


tgatggtgtctctaaggtc 


2291 


2310 


SEQ ID NO: 1468 


gacctgcgcaacgagatca 


8823 


8842 


1 


4 


SEQ ID NO: 466


aaacatgagcaggatatgg 


2343 


2362 


SEQ ID NO: 1469 


ccatgatctacatttgttt 


6788 


6807 


1 


4


SEQ ID NO: 467


gaagctgattaaagatttg 


2387 


2406 


SEQ ID NO: 1470 


caaaaacattttcaacttc 


5279 


5298 


1 


4 


SEQ ID NO: 468


aaagatttgaaatccaaag 


2397 


2416 


SEQ ID NO: 1471 


ctttaagttcagcatcttt 


7606 


7625 


1 


4 


SEQ ID NO: 469


gatgggtgcccgcactctg 


2510 


2529 


SEQ ID NO: 1472 


cagatttgaggattccatc 


7975 


7994 


1 


4 


SEQ ID NO: 470


gggatcccccagatgattg 


2532 


2551 


SEQ ID NO: 1473 


caatcacaagtcgattccc 


9075 


9094 


1 


4 


SEQ ID NO: 471


ttttcttcactacatcttc 


2585 


2604 


SEQ ID NO: 1474 


gaagtgtcagtggcaaaaa 


10374 10393 1 


4 


SEQ ID NO: 472


tcttcactacatcttcatg


2588 


2607 


SEQ ID NO: 1475 


catggcattatgatgaaga 


3607 


3626 


1 


4 


SEQ ID NO: 473


tacatcttcatggagaatg 


2595 


2614 


SEQ ID NO: 1476 


cattatggaggcccatgta 


9437 


9456 


1 


4 


SEQ ID NO: 474


ttcatggagaatgcctttg 


2601 


2620 


SEQ ID NO: 1477 


caaaatcaactttaatgaa 


6599 


6618


1 


4 


SEQ ID NO: 475


tcatggagaatgcctttga 


2602 


2621 


SEQ ID NO: 1478


tcaacacaatcttcaatga 


13108 


13127 1 


4 


SEQ ID NO: 476


tttgaactccccactggag


2616 


2635 


SEQ ID NO: 1479 


ctccccaggacctttcaaa 


9834 


9853 


1 


4 


SEQ ID NO: 477


ttgaactccccactggagc


2617


2636 


SEQ ID NO: 1480 


gctccccaggacctttcaa 


9833 


9852 


1 


4 


SEQ ID NO: 478


tgaactccccactggagct


2618 


2637 


SEQ ID NO: 1481 


agctccccaggacctttca 


9832 


9851 


1 


4 


SEQ ID NO: 479


cactggagctggattacag 


2627 


2646 


SEQ ID NO: 1482 


ctgtttctgagtcccagtg 


9336


9355


1 


4 


SEQ ID NO: 480


actggagctggattacagt 


2628 


2647 


SEQ ID NO: 1483 


actgtttctgagtcccagt 


9335 


9354 


1 


4 


SEQ ID NO: 481


agttgcaaatatcttcatc 


2644 


2663 


SEQ ID NO: 1484 


gatgatgccaaaatcaact 


6591 


6610 


1 


4 


SEQ ID NO: 482


gttgcaaatatcttcatct 


2645 


2664 


SEQ ID NO: 1485 


agatgatgccaaaatcaac 


6590 


6609 


1 


4 


SEQ ID NO: 483


aaatatcttcatctggagt 


2650 


2669 


SEQ ID NO: 1486


actcagaaggatggcattt 


13996 


14015 1 


4 


SEQ ID NO: 484


taaaactggaagtagccaa 


2695 


2714 


SEQ ID NO: 1487 


ttggttacaggaggcttta 


7592 


7611 


1 


4 


SEQ ID NO: 485 


ggctgaactggtggcaaaa 


2720 


2739 


SEQ ID NO: 1488 


ttttcttttcagcccagcc 


9220 


9239 


1 


4 


SEQ ID NO: 486 


tgtggagtttgtgacaaat 


2750 


2769 


SEQ ID NO: 1489 


attttcaagcaaatgcaca 


8530 


8549 


1 


4 


SEQ ID NO: 487 


ttgtgacaaatatgggcat 


2758 


2777 


SEQ ID NO: 1490 


atgcgtctaccttacacaa 


9513 


9532 


1 


4 


SEQ ID NO: 488


atgaacaccaacttcttcc 


2811 


2830 


SEQ ID NO: 1491 


ggaagctgaagtttatcat 


2869 


2888 


1 


4 


SEQ ID NO: 489 


cttccacgagtcgggtctg 


2825 


2844 


SEQ ID NO: 1492 


cagagctatcactgggaag 


5227 


5246 


1 


4 
SEQ ID NO:


490 


gagtcgggtctggaggctc 


2832 


2851 


SEQ ID NO: 1493 


gagcttactggacgaactc 


6132 


6151 1 


4 


SEQ ID NO: 


491 


cctaaaagctgggaagctg 


2858 


2877 


SEQ ID NO: 1494 


cagcctccccagccgtagg 


12112 


12131 1 


4 


SEQ ID NO: 


492 


agctgggaagctgaagttt 


2864 


2883 


SEQ ID NO: 1495 


aaactgttaatttacagct 


5455 


5474 1 


4 


SEQ ID NO: 


493 


ccagattagagctggaact 


3106 


3125 


SEQ ID NO: 1496 


agtttccggggaaacctgg


12718 


12737 1 


4 


SEQ ID NO: 


494 


ggataccctgaagtttgta 


3200 


3219 


SEQ ID NO: 1497 


tacagtattctgaaaatcc 


8385 


8404 1 


4 


SEQ ID NO: 


495


ctgaggctaccatgacatt 


3244 


3263 


SEQ ID NO: 1498 


aatgagctcatggcttcag 


3809 


3828 1 


4 


SEQ ID NO: 


496 


tgtccagtgaagtccaaat 


3289 


3308 


SEQ ID NO: 1499 


attttgagaggaatcgaca 


6349 


6368 1 


4 


SEQ ID NO: 


497 


aattccggattttgatgtt 


3305 


3324 


SEQ ID NO: 1500 


aacacatgaatcacaaatt


8930 


8949 1 


4 


SEQ ID NO: 


498 


ttccggattttgatgttga 


3307 


3326 


SEQ ID NO: 1501 


tcaaaacgagcttcaggaa 


13199 


13218 1 


4 


SEQ ID NO: 


499 


cggaacaatcctcagagtt 


3329 


3348 


SEQ ID NO: 1502 


aacttgtacaactggtccg 


4203 


4222 1 


4 


SEQ ID NO: 


500 


tcctcagagttaatgatga 


3337 


3356 


SEQ ID NO: 1503 


tcatcaattggttacagga 


7585 


7604 1 


4 


SEQ ID NO: 


501 


ctcaccctggacattcaga 


3384 


3403 


SEQ ID NO: 1504 


tctgcagaacaatgctgag 


12431 


12450 1 


4 


SEQ ID NO: 


502 


cattcagaacaagaaaatt


3395 


3414 


SEQ ID NO: 1505 


aattgactttgtagaaatg 


8096 


8115 1 


4 


SEQ ID NO: 


503 


actgaggtcgccctcatgg 


3414 


3433 


SEQ ID NO: 1506 


ccatgcaagtcagcccagt 


10916 


10935 1 


4 


SEQ ID NO: 


504 


ttatttccataccccgttt


3478 


3497 


SEQ ID NO: 1507 


aaactgcctatattgataa


13872 


13891 1 


4 


SEQ ID NO: 


505 


gtttgcaagcagaagccag 


3493 


3512 


SEQ ID NO: 1508 


ctggacttctcttcaaaac 


5400 


5419 1 


4 


SEQ ID NO: 


506 


tttgcaagcagaagccaga 


3494 


3513 


SEQ ID NO: 1509 


tctgggtgtcgacagcaaa 


5264 


5283 1 


4 


SEQ ID NO: 


507 


ttgcaagcagaagccagaa 


3495 


3514 


SEQ ID NO: 1510 


ttctgggtgtcgacagcaa 


5263 


5282 1 


4 


SEQ ID NO: 


508 


ctgcttctccaaatggact 


3546 


3565 


SEQ ID NO: 1511 


agtcaagattgatgggcag 


4559 


4578 1 


4 


SEQ ID NO: 


509 


tgctacagcttatggctcc 


3569 


3588 


SEQ ID NO: 1512 


ggaggctttaagttcagca 


7601 


7620 1 


4 


SEQ ID NO: 


510 


acagcttatggctccacag 


3573 


3592 


SEQ ID NO: 1513 


ctgtatagcaaattcctgt 


5889


5908 1 


4 


SEQ ID NO: 


511 


tttccaagagggtggcatg 


3592 


3611 


SEQ ID NO: 1514 


catggacttcttctggaaa 


8869 


8888 1 


4 


SEQ ID NO: 


512 


ccaagagggtggcatggca 


3595 


3614 


SEQ ID NO: 1515 


tgcccagcaagcaagttgg 


9353 


9372 1 


4 


SEQ ID NO: 


513 


gtggcatggcattatgatg 


3603 


3622 


SEQ ID NO: 1516 


catccttaacaccttccac 


8063 


8082 1 


4 


SEQ ID NO: 


514 


tgatgaagagaagattgaa 


3617 


3636 


SEQ ID NO: 1517 


ttcactgttcctgaaatca 


7863 


7882 1 


4 


SEQ ID NO: 


515 


gaagagaagattgaatttg 


3621 


3640 


SEQ ID NO: 1518 


caaaaacattttcaacttc 


5279 


5298 1 


4 


SEQ ID NO: 


516 


gagaagattgaatttgaat 


3624 


3643 


SEQ ID NO: 1519 


attcataatcccaactctc 


8270 


8289 1 


4 


SEQ ID NO: 


517 


tttgaatggaacacaggca 


3636 


3655 


SEQ ID NO: 1520 


tgcctttgtgtacaccaaa 


11228 


11247 1 


4 


SEQ ID NO: 


518 


aggcaccaatgtagatacc 


3650 


3669 


SEQ ID NO: 1521 


ggtaacctaaaaggagcct 


5583 


5602 1 


4 


SEQ ID NO: 


519


caaaaaaatgacttccaat 


3668 


3687 


SEQ ID NO: 1522 


attgaagtacctacttttg 


8358 


8377 1 


4 


SEQ ID NO: 


520 


aaaaaaatgacttccaatt 


3669 


3688 


SEQ ID NO: 1523 


aattgaagtacctactttt 


8357 


8376 1 


4 


SEQ ID NO: 


521 


aaaaaatgacttccaattt 


3670 


3689 


SEQ ID NO: 1524 


aaatccaatctcctctttt 


8398 


8417 1 


4 


SEQ ID NO: 


522 


cagagtccctcaaacagac 


3752 


3771 


SEQ ID NO: 1525 


gtctgtgggattccatctg 


4082 


4101 1 


4 


SEQ ID NO: 


523 


aaattaatagttgcaatga 


3795 


3814 


SEQ ID NO: 1526 


tcataagttcaatgaattt 


13178 


13197 1 


4 


SEQ ID NO: 


524 


ttcaacctccagaacatgg 


3891 


3910 


SEQ ID NO: 1527


ccattgaccagatgctgaa 


8134 


8153 1 


4 


SEQ ID NO: 


525 


tgggattgccagacttcca 


3907 


3926 


SEQ ID NO: 1528 


tggaaatgggcctgcccca 


8895 


8914 1 


4 


SEQ ID NO: 


526 


cagtttgaaaattgagatt 


3986 


4005 


SEQ ID NO: 1529 


aatcacaactcctccactg 


9533 


9552 1 


4 


SEQ ID NO: 


527 


gaaaattgagattcctttg 


3992 


4011 


SEQ ID NO: 1530 


caaaactaccacacatttc 


13686 


13705 1 


4 
SEQ ID NO: 528


tttgccttttggtggcaaa 


4007 


4026 


SEQ ID NO: 1531 


tttgagaggaatcgacaaa 


6351 


6370 


1 


4 


SEQ ID NO: 529 


ctccagagatctaaagatg 


4028 


4047 


SEQ ID NO: 1532 


catcaattggttacaggag 


7586


7605 


1 


4 


SEQ ID NO: 530 


tctaaagatgttagagact 


4037 


4056 


SEQ ID NO: 1533 


agtccttcatgtccctaga 


10025 10044 1 


4 


SEQ ID NO: 531 


ctgtgggattccatctgcc 


4084 


4103 


SEQ ID NO: 1534 


ggcattttgaaaaaaacag 


9727 


9746 


1 


4 


SEQ ID NO: 532 


atctgccatctcgagagtt 


4096 


4115 


SEQ ID NO: 1535 


aactctcaaaccctaagat 


8548 


8567 


1 


4 


SEQ ID NO: 533 


tctcgagagttccaagtcc


4104 


4123 


SEQ ID NO: 1536


ggacattcctctagcgaga 


8207 


8226 


1 


4 


SEQ ID NO: 534 


agtccctacttttaccatt 


4118 


4137 


SEQ ID NO: 1537 


aatgaatacagccaggact 


6078 


6097 


1 


4 


SEQ ID NO: 535 


acttttaccattcccaagt 


4125 


4144 


SEQ ID NO: 1538 


actttgtagaaatgaaagt 


8101 


8120 


1 


4 


SEQ ID NO: 536 


cattcccaagttgtatcaa 


4133 


4152


SEQ ID NO: 1539 


ttgaaggacttcaggaatg 


12001 


12020 1 


4 


SEQ ID NO: 537 


accacatgaaggctgactc 


4276 


4295 


SEQ ID NO: 1540 


gagtaaaccaaaacttggt 


9016 


9035 


1 


4 


SEQ ID NO: 538 


tttcctacaatgtgcaagg


4309 


4328 


SEQ ID NO: 1541 


cctttaacaattcctgaaa


9495 


9514 


1 


4 


SEQ ID NO: 539 


ctggagaaacaacatatga 


4330 


4349 


SEQ ID NO: 1542 


tcattctgggtctttccag 


11027 


11046 1 


4 


SEQ ID NO: 540 


atcatgtgatgggtctcta 


4370 


4389 


SEQ ID NO: 1543 


tagaattacagaaaatgat 


6557 


6576 


1 


4 


SEQ ID NO: 541 


catgtgatgggtctctacg 


4372 


4391 


SEQ ID NO: 1544 


cgtaggcaccgtgggcatg 


12125 


12144 1 


4 


SEQ ID NO: 542 


ttctagattcgaatatcaa 


4399 


4418 


SEQ ID NO: 1545 


ttgatgatgctgtcaagaa 


7300 


7319 


1 


4 


SEQ ID NO: 543 


tggggaccacagatgtctg 


4491 


4510 


SEQ ID NO: 1546 


cagaattccagcttcccca 


8326 


8345 


1 


4 


SEQ ID NO: 544 


ctaacactggccggctcaa 


4636 


4655 


SEQ ID NO: 1547 


ttgaggctattgatgttag 


6976 


6995 


1 


4 


SEQ ID NO: 545


taacactggccggctcaat 


4637 


4656 


SEQ ID NO: 1548 


attgaggctattgatgtta 


6975 


6994 


1 


4 


SEQ ID NO: 546 


aacactggccggctcaatg


4638 


4657 


SEQ ID NO: 1549 


cattgaggctattgatgtt 


6974 


6993 


1 


4 


SEQ ID NO: 547 


ctggccggctcaatggaga 


4642 


4661 


SEQ ID NO: 1550 


tctccatctgcgctaccag 


12065 


12084 1 


4 


SEQ ID NO: 548 


agataacaggaagatatga 


4705 


4724 


SEQ ID NO: 1551 


tcatctcctttcttcatct 


10202 


10221 


1 


4 


SEQ ID NO: 549 


tccctcacctccacctctg 


4737 


4756 


SEQ ID NO: 1552 


cagatatatatctcaggga 


8176 


8195 


1 


4 


SEQ ID NO: 550 


agctgactttaaaatctga 


4810 


4829 


SEQ ID NO: 1553 


tcaggctcttcagaaagct 


7922 


7941 


1 


4 


SEQ ID NO: 551 


ctgactttaaaatctgaca 


4812 


4831 


SEQ ID NO: 1554 


tgtcaagataaacaatcag 


8732 


8751 


1 


4 


SEQ ID NO: 552 


caagatggatatgaccttc 


4865


4884 


SEQ ID NO: 1555 


gaagtagtactgcatcttg 


6835 


6854 


1 


4 


SEQ ID NO: 553 


gctgcgttctgaatatcag 


4901 


4920 


SEQ ID NO: 1556 


ctgagtcccagtgcccagc 


9342 


9361 


1 


4 


SEQ ID NO: 554 


cgttctgaatatcaggctg 


4905 


4924 


SEQ ID NO: 1557 


cagcaagtacctgagaacg 


8603 


8622 


1 


4 


SEQ ID NO: 555 


aattcccatggtcttgagt 


4968 


4987 


SEQ ID NO: 1558 


actcagatcaaagttaatt 


12264 


12283 1 


4 


SEQ ID NO: 556 


tggtcttgagttaaatgct 


4976 


4995 


SEQ ID NO: 1559 


agcacagtacgaaaaacca 


10801 


10820 1 


4 


SEQ ID NO: 557 


cttgagttaaatgctgaca 


4980 


4999 


SEQ ID NO: 1560 


tgtccctagaaatctcaag 


10034 


10053 1 


4 


SEQ ID NO: 558


ttgagttaaatgctgacat 


4981 


5000 


SEQ ID NO: 1561 


atgtccctagaaatctcaa 


10033 


10052 1 


4 


SEQ ID NO: 559 


tgagttaaatgctgacatc 


4982 


5001 


SEQ ID NO: 1562 


gatggaaccctctccctca 


4726 


4744 


1 


4 


SEQ ID NO: 560 


acttgaagtgtagtctcct 


5086 


5105 


SEQ ID NO: 1563 


aggaaactcagatcaaagt 


12259 


12278 1 


4 


SEQ ID NO: 561 


agtgtagtctcctggtgct 


5092 


5111 


SEQ ID NO: 1564 


agcagccagtggcaccact 


12506 


12525 1 


4 


SEQ ID NO: 562 


gtgctggagaatgagctga 


5106 


5125 


SEQ ID NO: 1565 


tcagccaggtttatagcac 


7726 


7745 


1 


4 


SEQ ID NO: 563 


ctggggcatctatgaaatt 


5143 


5162 


SEQ ID NO: 1566 


aatttctgattaccaccag 


13571 


13590 1 


4 


SEQ ID NO: 564 


atggccgcttcagggaaca 


5170 


5189 


SEQ ID NO: 1567 


tgttttttggaaatgccat 


8641 


8660 


1 


4 


SEQ ID NO: 565 


ttcagtctggatgggaaag 


5199 


5218 


SEQ ID NO: 1568 


ctttgacaggcattttgaa 


9719 


9738 


1 


4 
SEQ ID NO: 566


ccatgattctgggtgtcga 


5257 


5276 


SEQ ID NO: 567


aaaacattttcaacttcaa 


5281 


5300 


SEQ ID NO: 568


cttaagctctcaaatgaca 


5316 


5335 


SEQ ID NO: 569


ttaagctctcaaatgacat


5317 


5336 


SEQ ID NO: 570


catgatgggctcatatgct


5333 


5352 


SEQ ID NO: 571


tgggctcatatgctgaaat 


5338


5357 


SEQ ID NO: 572


actggacttctcttcaaaa 


5399 


5418 


SEQ ID NO: 573


acttctcttcaaaacttga 


5404 


5423 


SEQ ID NO: 574


ctgacaagttttataagca 


5437 


5456 


SEQ ID NO: 575


aagttttataagcaaactg 


5442 


5461 


SEQ ID NO: 576


ctgttaatttacagctaca 


5458 


5477 


SEQ ID NO: 577


ttacagctacagccctatt 


5466


5485 


SEQ ID NO: 578


tctggtaactactttaaac 


5486 


5505 


SEQ ID NO: 579


tttaaacagtgacctgaaa 


5498 


5517 


SEQ ID NO: 580


ttaaacagtgacctgaaat 


5499 


5518 


SEQ ID NO: 581


cagtgacctgaaatacaat 


5504 


5523 


SEQ ID NO: 582


tgtggctggtaacctaaaa 


5576 


5595 


SEQ ID NO: 583


ttatcagcaagctataaag 


5649 


5668 


SEQ ID NO: 584


ggttcagggtgtggagttt 


5684 


5703 


SEQ ID NO: 585


attcagactcactgcattt


5767 


5786 


SEQ ID NO: 586


ttcagactcactgcatttc


5768 


5787 


SEQ ID NO: 587


tacaaatggcaatgggaaa 


5840 


5859 


SEQ ID NO: 588


gctgtatagcaaattcctg 


5888 


5907 


SEQ ID NO: 589


tgagcagacaggcacctgg 


6035 


6054 


SEQ ID NO: 590


ggcacctggaaaotcaaga 


6045 


6064 


SEQ ID NO: 591


tgaatacagccaggacttg 


6080 


6099 


SEQ ID NO: 592


gaatacagccaggacttgg 


6081 


6100 


SEQ ID NO: 593


ctggacgaactctggctga 


6139 


6158 


SEQ ID NO: 594


ttttactcagtgagcccat 


6193 


6212 


SEQ ID NO: 595


gatgagagatgccgttgag 


6233 


6252 


SEQ ID NO: 596


aattgttgcttttgtaaag 


6269 


6288 


SEQ ID NO: 597 


cttttgtaaagtatgataa 


6277 


6296 


SEQ ID NO: 598


tttgtaaagtatgataaaa 


6279 


6298 


SEQ ID NO: 599 


tccattaacctcccatttt 


6312 


6331 


SEQ ID NO: 600 


ccattaacctcccattttt 


6313 


6332 


SEQ ID NO: 601 


cttgcaagaatattttgag


6338 


6357 


SEQ ID NO: 602 


agaatattttgagaggaat 


6344 


6363 


SEQ ID NO: 603


attatagttgtactggaaa 


6372 


6391 



SEQ ID NO: 1569


tcgatgcacatacaaatgg 


5830 


5849 1 


4 


SEQ ID NO: 1570 


ttgatgttagagtgctttt 


6985 


7004 1 


4 


SEQ ID NO: 1571 


tgtcctacaacaagttaag


7247 


7266 1 


4 


SEQ ID NO: 1572 


atgtcctacaacaagttaa 


7246 


7265 1 


4 


SEQ ID NO: 1573


agcatctttggctcacatg 


7616 


7635 1 


4 


SEQ ID NO: 1574 


atttatcaaaagaagccca 


12934 12953 1 


4 


SEQ ID NO: 1575 


ttttggcaagctatacagt 


8372 


8391 1 


4 


SEQ ID NO: 1576 


tcaattgggagagacaagt 


6496 


6515 1


4 


SEQ ID NO: 1577


tgctttgtgagtttatcag 


9685 


9704 1 


4 


SEQ ID NO: 1578 


cagtcatgtagaaaaactt 


4421 


4440 1 


4 


SEQ ID NO: 1579 


tgtactggaaaacgtacag 


6380 


6399 1 


4 


SEQ ID NO: 1580 


aatattgatcaatttgtaa 


6417 


6436 1 


4 


SEQ ID NO: 1581 


gtttgaaaaacaaagcaga 


11812 


11831 1 


4 


SEQ ID NO: 1582 


tttcatttgaaagaataaa 


7024 


7043 1 


4 


SEQ ID NO: 1583 


atttcaagcaagaacttaa 


10426 


10445 1 


4 


SEQ ID NO: 1584


attggcgtggagcttactg 


6123 


6142 1 


4 


SEQ ID NO: 1585 


ttttgctggagaagccaca 


10757 10776 1 


4 


SEQ ID NO: 1586 


ctttgcactatgttcataa 


12756 12775 1 


4 


SEQ ID NO: 1587 


aaacacctaagagtaaacc 


9006 


9025 1 


4 


SEQ ID NO: 1588 


aaatgctgacatagggaat 


8429 


8448 1 


4 


SEQ ID NO: 1589 


gaaatattatgaacttgaa 


13304 13323 1 


4 


SEQ ID NO: 1590


tttcctaaagctggatgta 


11168 


11187 1 


4 


SEQ ID NO: 1591 


caggtccatgcaagtcagc


10911 


10930 1 


4 


SEQ ID NO: 1592


ccagcttccccacatctca 


8333 


8352 1 


4 


SEQ ID NO: 1593 


tcttcgtgtttcaactgcc 


11213 


11232 1 


4 


SEQ ID NO: 1594 


caagtaagtgctaggttca 


9372 


9391 1 


4 


SEQ ID NO: 1595 


ccaacacttacttgaattc 


10660 


10679 1 


4 


SEQ ID NO: 1596 


tcagaaagctaccttccag 


7931 


7950 1 


4 


SEQ ID NO: 1597 


atggacttcttctggaaaa 


8870 


8889 1 


4 


SEQ ID NO: 1598 


ctcatctcctttcttcatc 


10201 


10220 1 


4 


SEQ ID NO: 1599 


cttttctaaacttgaaatt 


9056 


9075 1 


4 


SEQ ID NO: 1600 


ttatgaacttgaagaaaag 


13310 


13329 1 


4 


SEQ ID NO: 1601 


ttttcacattagatgcaaa 


8413 


8432 1 


4 


SEQ ID NO: 1602 


aaaattgatgatatctgga 


10719 10738 1 


4 


SEQ ID NO: 1603 


aaaagggtcatggaaatgg 


8885 


8904 1 


4 


SEQ ID NO: 1604 


ctcaattttgattttcaag 


8520 


8539 1 


4 


SEQ ID NO: 1605 


attccctccattaagttct 


11700 


11719 1 


4 


SEQ ID NO: 1606 


tttcaagcaagaacttaat 


10427 10446 1 


4 
SEQ ID NO: 604


gaagcacatcaatattgat 


6407 


6426 


SEQ ID NO: 605 


acatcaatattgatcaatt 


6412 


6431 


SEQ ID NO: 606 


gaaaactcccacagcaagc 6457 


6476 


SEQ ID NO: 607 


ctgaattcattcaattggg 


6486 


6505 


SEQ ID NO: 608 


tgaattcattcaattggga 


6487 


6506 


SEQ ID NO: 609 


aactgactgctctcacaaa 


6532 


6551 


SEQ ID NO: 610 


aaaagtatagaattacaga 


6550 


6569 


SEQ ID NO: 611 


atcaactttaatgaaaaac 


6603 


6622 


SEQ ID NO: 612 


tgatttgaaaatagctatt 


6686 


6705 


SEQ ID NO: 613 


atttgaaaatagctattgc 


6688 


6707 


SEQ ID NO: 614 


attgctaatattattgatg 


6702 


6721 


SEQ ID NO: 615 


gaaaaattaaaaagtcttg 


6729 


6748 


SEQ ID NO: 616 


actatcatatccgtgtaat 


6754 


6773 


SEQ ID NO: 617 


tattgattttaacaaaagt 


6815 


6834 


SEQ ID NO: 618 


ctgcagcagcttaagagac 


6906 


6925 


SEQ ID NO: 619 


aaaacaacacattgaggct 


6965 


6984 


SEQ ID NO: 620


ttgagcatgtcaaacactt 


7051


7070 


SEQ ID NO: 621 


tttgaagtagctgagaaaa 


7092 


7111 


SEQ ID NO: 622 


ttagtagagttggcccacc 


7191 


7210 


SEQ ID NO: 623 


tgaaggagactattcagaa 


7219 


7238 


SEQ ID NO: 624 


gagactattcagaagctaa 


7224 


7243 


SEQ ID NO: 625 


aattagttggatttattga 


7285 


7304 


SEQ ID NO: 626 


gcttaatgaattatctttt 


7319 


7338 


SEQ ID NO: 627 


ttaacaaattccttgacat 


7357 


7376 


SEQ ID NO: 628 


aaattaaagtcatttgatt 


7386 


7405 


SEQ ID NO: 629 


gactcaatggtgaaattca 


7456 


7475 


SEQ ID NO: 630 


gaaattcaggctctggaac 


7467


7486 


SEQ ID NO: 631 


actaccacaaaaagctgaa 


7484 


7503 


SEQ ID NO: 632 


ccaaaataaccttaatcat 


7570 


7589


SEQ ID NO: 633 


aaataaccttaatcatcaa 


7573 


7592


SEQ ID NO: 634 


tttaagttcagcatctttg 


7607 


7626 


SEQ ID NO: 635


caggtttatagcacacttg 


7731 


7750 


SEQ ID NO: 636 


gttcactgttcctgaaatc 


7862 


7881 


SEQ ID NO: 637 


cactgttcctgaaatcaag 


7865 


7884 


SEQ ID NO: 638 


actgttcctgaaatcaaga 


7866 


7885 


SEQ ID NO: 639 


gcctgcctttgaagtcagt


7901 


7920 


SEQ ID NO: 640 


taacagatttgaggattcc 


7972 


7991 


SEQ ID NO: 641 


gttttccacaccagaattt 


8042 


8061 



SEQ ID NO: 1607 


atcagttcagataaacttc


7991 


8010 1 


4 


SEQ ID NO: 1608 


aattccctgaagttgatgt


11479 11498 1 


4 


SEQ ID NO: 1609 


gctttctcttccacatttc


10052 10071 1 


4 


SEQ ID NO: 1610


cccatttacaaatcttcaa 


11363 11382 1 


4 


SEQ ID NO: 1611 


tcccatttacaaatcttca 


11362 11381 1


4 


SEQ ID NO: 1612 


tttgaggattccatcaatt 


7979 


7998 1 


4


SEQ ID NO: 1613 


tctggctccctcaactttt


9042 


9061 1 


4


SEQ ID NO: 1614 


stttattaaaaatattaat 


6803 


6822 1 


4


SEQ ID NO: 1615 


aatattattgatgaaatca


6708 


6727 1 


4


SEQ ID NO: 1616 


gcaaciaacttaataaaaat 


10433 10452 1 


4


SEQ ID NO: 1617 


catcacactaaataccaat 


10151


10170 1 


4


SEQ ID NO: 1618 


caaaaQcttataaaattte 


11153 11172 1 


4


SEQ ID NO: 1619 


attactttgaaaaattaat 


7273 


7292 1 


4


SEQ ID NO: 1620 


acttaacttcaaaaaaata 


11396 


11415 1 


4


SEQ ID NO: 1621 


gtcttcagtgaaactacaa 


10691 


10710 1 


4


SEQ ID NO: 1622 


agcctcacctcttactttt 


10563 10582 1 


4 


SEQ ID NO: 1623 


aagtagctgagaaaatcaa 


7096 


7115 1 


4 


SEQ ID NO: 1624 


ttttcacattagatgcaaa


8413 


8432 1 


4 


SEQ ID NO: 1625 


ggtggactcttgctgctaa


7768 


7787 1 


4 


SEQ ID NO: 1626 


ttctcaattttgattttca 


8518 


8537 1 


4 


SEQ ID NO: 1627 


ttagccacagctctatctc 


10293 


10312 1 


4


SEQ ID NO: 1628 


tcaagaagcttaatgaatt


7312 


7331 1 


4 


SEQ ID NO: 1629 


aaaacQaocttcaaaaaac 


13201 


13220 1 


4


SEQ ID NO: 1630 


atgtcctacaacaagttaa


7246 


7265 1 




SEQ ID NO: 1631 


aatcctttgacaggcattt


9716 


9734 1 




SEQ ID NO: 1632 


tgaaattcaatcacaaatc 


9068 


9087 1 


4


SEQ ID NO: 1633 


gttctcaattttgattttc


8517 


8536 1 




SEQ ID NO: 1634 


ttcaggaactattgctaat 


10637 


10656 1 


4 


SEQ ID NO: 1635 


atgatttccctgaccttaa


10942 


10961 1 


4 


SEQ ID NO: 1636 


ttgaagtaaaagaaaattt 


10741 


10760 1 


4 


SEQ ID NO: 1637 


caaatctggatttcttaaa


9472 


9491 1 


4 


SEQ ID NO: 1638


caagggttcactgttcctg


7857 


7876 1 


4 


SEQ ID NO: 1639 


gattctcagatgagggaac


8914 


8933 1 


4 


SEQ ID NO: 1640 


cttgaacacaaagtcagtg 


6000 


6019 1 


4 


SEQ ID NO: 1641 


tcttgaacacaaagtcagt 


5999 


6018 1 


4 


SEQ ID NO: 1642 


actgttgactcaggaaggc 


12572 


12591 1 


4 


SEQ ID NO: 1643 


ggaagcttctcaagagtta 


13214 


13233 1 


4 


SEQ ID NO: 1644 


aaatttctctgctggaaac 


9410 


9429 1 


4 
SEQ ID NO: 642


tcagaaccattgaccagat


8128 


8147 


SEQ ID NO: 643


tacjcoaa aatcaccctacc 


8218 


8237


SEQ ID NO: 644 


ccttaataattttcaaatt 


8231 


8250


SEQ ID NO: 645 


acataccaaaattccaact 




OOOo 


SEQ ID NO: 646 


aatactaacataaaciaata 




RAAO 


SEQ ID NO: 647 


stQciaacsisaGasiRiaa 




OhDU 


SEQ ID NO: 648 


aaccacctoaoioaaacaas 






SEQ ID NO: 649 


aacaoatatcacaacttcr 


8468


8487


SEQ ID NO: 650 


tgcacaactctcaaaccct


8543 


8562


SEQ ID NO: 651 


aggaatcaataaaattctc


8584 


8603


SEQ ID NO: 652 


tttttggaaatgccattga


8644 


8663


SEQ ID NO: 653 


aatgaaataattgtcaaga


8721 


8740 


SEQ ID NO: 654 


gtcaagataaacaatcagc


8733 


8752


SEQ ID NO: 655 


tccacaaattgaacatccc


8779 


8798


SEQ ID NO: 656 


ttgaacatccccaaactgg


8787 


8806


SEQ ID NO: 657 


acatccccaaactggactt


8791 


8810 


SEQ ID NO: 658 


acttctctagtcaggctga


8806 


8825 


SEQ ID NO: 659 


tgaatcacaaattaatttc


8936


8955


SEQ ID NO: 660 


agaaggacccctcacttcc 


8960


8979


SEQ ID NO: 661 


ttggactgtccaataaaat 


8980


8999


SEQ ID NO: 662 


actgtccaataaaatcaat


8984


9003


SEQ ID NO: 663 


ctgtccaataaaatcaata


8985


9004


SEQ ID NO: 664 


atttatoaatctacictccn 

y « B hm by d M fcy M w & w WW 


WWWW 




SEQ ID NO: 665 


atoaatctdactccctcaa 


Qn37 

www r 




SEQ ID NO: 666 


ctcaacttttctaaacttg


9051


9070


SEQ ID NO: 667 


ctaaaaacataacactatt 


9121


9140


SEQ ID NO: 668 


aaggcatggcactgtttgg 


9124 


9143


SEQ ID NO: 669 


atccacaaacaatgaaggg 


9254 


9273


SEQ ID NO: 670 


ggaatttgaaagttcgttt 


9271 




SEQ ID NO: 671 


aataactatgcactgtttc 


9324 


9343


SEQ ID NO: 672 


gaaacaacgagaacattat 


9424 


9443 


SEQ ID NO: 673


ttcttgaaaacgacaaagc 


9591 


9610


SEQ ID NO: 674 


ataagaaaaacaaacacag 9640 


9659 


SEQ ID NO: 675 


aaaacaaacacaggcattc 


9646 


9665 


SEQ ID NO: 676 


gcattccatcacaaatcct 


9659 


9678 


SEQ ID NO: 677 


tttgaaaaaaacagaaaca 


9732 


9751 


SEQ ID NO: 678 


caatgcattagattttgtc 


9749 


9768 


SEQ ID NO: 679 


caaagctgaaaaatctcag 


9809 


9828 



SEQ ID NO: 1645 atctgcagaacaatgctga 12430 12449 1 4

SEQ ID NO: 1646 ggcagcttctggcttgcta 12293 12312 1 4

SEQ ID NO: 1647 aactgttgactcaggaagg 12571 12590 1 4

SEQ ID NO: 1648 agctgccagtccttcatgt 10018 10037 1 4

SEQ ID NO: 1649 cattaatcctgccatcatt 9997 10016 1 4

SEQ ID NO: 1650 ccatttgagatcacggcat 9237 9256 1 4

SEQ ID NO: 1651 ttcgttttccattaaggtt 9283 9302 1 4

SEQ ID NO: 1652 ggaagtggccctgaatgct 10964 10983 1 4

SEQ ID NO: 1653 agggaaagagaagattgca 13493 13512 1 4

SEQ ID NO: 1654 gagaacttactatcatcct 13780 13799 1 4

SEQ ID NO: 1655 tcaatgaatttattcaaaa 13186 13205 1 4

SEQ ID NO: 1656 tcttttcagcccagccatt 9223 9242 1 4

SEQ ID NO: 1657 gctgactttaaaatctgac 4811 4830 1 4

SEQ ID NO: 1658 gggatttcctaaagctgga 11164 11183 1 4

SEQ ID NO: 1659 ccagtttccagggactcaa 12595 12614 1 4

SEQ ID NO: 1660 aagtcgattcccagcatgt 9082 9101 1 4

SEQ ID NO: 1661 tcagatggaaaaatgaagt 11002 11021 1 4

SEQ ID NO: 1662 gaaagtccataatggttca 12809 12828 1 4

SEQ ID NO: 1663 ggaagaagaggcagcttct 12284 12303 1 4

SEQ ID NO: 1664 atctaaatgcagtagccaa 11626 11645 1 4

SEQ ID NO: 1665 attgataaaaccatacagt 13883 13902 1 4

SEQ ID NO: 1666 tattgataaaaccatacag 13882 13901 1 4

SEQ ID NO: 1667 gggaatctgatgaggaaac 12247 12266 1 4

SEQ ID NO: 1668 ttgagttgcccaccatcat 11659 11678 1 4

SEQ ID NO: 1669 caagatcgcagactttgag 11645 11664 1 4

SEQ ID NO: 1670 aacagaaacaatgcattag 9741 9760 1 4

SEQ ID NO: 1671 ccaagaaaaggcacacctt 11069 11088 1 4

SEQ ID NO: 1672 ccctaacagatttgaggat 7969 7988 1 4

SEQ ID NO: 1673 aaacaaacacaggcattcc 9647 9666 1 4

SEQ ID NO: 1674 gaaatactgttttcctatt 12828 12847 1 4

SEQ ID NO: 1675 ataaactgcaagatttttc 13600 13619 1 4

SEQ ID NO: 1676 gctttccaatgaccaagaa 11057 11076 1 4

SEQ ID NO: 1677 ctgtgctttgtgagtttat 9682 9701 1 4

SEQ ID NO: 1678 gaatttgaaagttcgtttt 9272 9291 1 4

SEQ ID NO: 1679 aggaagtggccctgaatgc 10963 10982 1 4

SEQ ID NO: 1680 tgttgaaagatttatcaaa 12926 12944 1 4

SEQ ID NO: 1681 gacaagaaaaaggggattg 10271 10290 1 4

SEQ ID NO: 1682 ctgagaacttcatcatttg 11430 11449 1 4
SEQ ID NO: 680


cciggatacacignccag


9855


9874 


SEQ ID NO: 681




QQflO 


9t^U1 


SEQ ID NO: 682


f ^tor^of r*/^f o n n f 
lUwlwCalwulagyuCl 


9990 


QD7CI 
99/0 


SEQ ID NO: 683


UuLwCalCCldyyUC/ig 


99w r 


99 fQ 


SEQ ID NO: 684


lUa lUayaywiy Uudg lUu 


lUU 1 i 


Anrvxfs 
1UU0U 


SEQ ID NO: 685


tn ftna 9 f*ttf H9 9 ^ n 
vri^clcl V/ IL 11 Ida V 


lU Iww 


lu loo 


SEQ ID NO: 686


UlClHuwllCalCllCal 




A ft*5'5C 


SEQ ID NO: 687


to t/^9tf n n f>9/%t/<i/« a ri 
igiwaltg algCawig va9 




I (\nAtz 


SEQ ID NO: 688


rO 9Trtr»Q^trt 0 "a 

igaigcacigcagiacaaa 




A noc<<i 


SEQ ID NO: 689


ay wlwig Iw Lwlgay Caaw 


lUwU 1 




SEQ ID NO: 690


cig CwgciaaUvCaauug 


iu*tuu 


4 A/l H A 

1U419 


SEQ ID NO: 691


ftn 9 n 4 9tn94tHna 

ugagaaigaauicaagc 




A f\A1C 


SEQ ID NO: 692


aaacctactgtctcttcct


10461


10480


SEQ ID NO: 693


f 9r*fttt^f* Q ttrt 0 ^fr 
IdLiLlUOUaligagiCai 


lUD/ w 


4 nCAyl 

1U094 


SEQ ID NO: 694


tcaggtccatgcaagtcag


10910




SEQ ID NO: 695


9for*99nff^9n<^i^r*antt/« 

d wdd y lUdg CwCagilC 


IU9 10 




SEQ ID NO: 696


tn 9 9 tnotoo/*a f^fo 0 n a 

igaa ig wiaaCawiEiag aa 


IU9/ 0 


1U994 


SEQ ID NO: 697


9n99/l9Tr*9n9tf^naooaa 
dyddydlUdgdiggadaao 


IU99O 


A ACiA R 


SEQ ID NO: 698


n npf a ttf^artpf nn 9f r*o 
yywieiiix/dLiviuudiCv 


1 1^OO 


1 0 


SEQ ID NO: 699


999ntHtnnr^tn9t9a9f 


1 I^OU 


1 1iC99 


SEQ ID NO: 700


a n ttttnn r* tn 9 19 9 94f 0 
dy uiiyy uiydiddaUC 




1 IOUI 


SEQ ID NO: 701


f^fnr(n/^tn999Af999fn9 
^/iyyyvriydddClaaaiga 




4 A Q*57 


SEQ ID NO: 702


^an ana 9 at9f^999f r^tat 
UdydydddldwdaalCidi 


1 l^wO 


1 n*t^*t 


SEQ ID NO: 703


nnnnt9999t(^/^^fnaan 
ydyyiddddiiwwwigdcig 


^ 1/17*5 
1 i*t# 4, 


J l*t9l 


SEQ ID NO: 704


f^ttffttnanafaanf^ntn 
wuiiiiydydiadwwyiy 


i i(«^7 


A A CCQ 
1 lOOD 


SEQ ID NO: 705


n 0 f nna af tnt/^ 0 ft^^t4 
ywiyyaaugivaUwCu 




A A TAd. 


SEQ ID NO: 706


gigidiaaigcCdCngya 


•1 A 70*7 


A A OAO 


SEQ ID NO: 707


atto^a^ atn ^on<^t^ a 0 
dUwCdCaigCagCiCaaC 


J I0DI 


«| A Q*7A 

no/o 


SEQ ID NO: 708


igddgddgdiggCdaani 


A A OQyl 


1 0 AOO 
I2OOO 


SEQ ID NO: 709


atcaaaagcccagcgttca


12042 


12061 


SEQ ID NO: 710


giyggcdiggaidiggaig 


1Z100 


121o4 


SEQ ID NO: 711


aaatggaacttctactaca 


12171 


12190 


SEQ ID NO: 712 


aaaaactcaccatattcaa 


12211 


12230 


SEQ ID NO: 713 


ctgagaagaaatctgcaga 


12420 


12439 


SEQ ID NO: 714 


acaatgctgagtgggttta 


12439 


12458 


SEQ ID NO: 715 


caatgctgagtgggtttat 


12440 


12459 


SEQ ID NO: 716 


ttaggcaaattgatgatat 


12469 


12488 


SEQ ID NO: 717 


ataaactaatagatgtaat 


12889 


12908 



SEQ ID NO: 1683 ctggacttctctagtcagg 8802 8821 1 4 

SEQ ID NO: 1684 tgaatctggctccctcaac 9038 9057 1 4 

SEQ ID NO: 1685 agaatccagatacaagaaa 6885 6904 1 4 

SEQ ID NO: 1686 cagaatccagatacaagaa 6884 6903 1 4 

SEQ ID NO: 1687 ggacagtgaaatattatga 13297 13316 1 4 

SEQ ID NO: 1688 ctggatgtaaccaccagca 11178 11197 1 4

SEQ ID NO: 1689 atgaagcttgctccaggag 13764 13783 1 4 

SEQ ID NO: 1690 ctgcgctaccagaaagaca 12072 12091 1 4 

SEQ ID NO: 1691 tttgagttgcccaccatca 11658 11677 1 4 

SEQ ID NO: 1692 gttgaccacaagcttagct 10539 10558 1 4

SEQ ID NO: 1693 caaagctggcaccagggct 13963 13982 1 4 

SEQ ID NO: 1694 gcttcaggaagcttctcaa 13208 13227 1 4 

SEQ ID NO: 1695 aggaaggccaagccagttt 12583 12602 1 4 

SEQ ID NO: 1696 atgattatgtcaacaagta 12355 12374 1 4

SEQ ID NO: 1697 ctgacatcttaggcactga 4993 5012 1 4 

SEQ ID NO: 1698 gaactcagaaggatggcat 13994 14013 1 4 

SEQ ID NO: 1699 ttctcaattttgattttca 8518 8537 1 4 

SEQ ID NO: 1700 ttttctaaatggaacttct 12165 12184 1 4 

SEQ ID NO: 1701 ggatctaaatgcagtagcc 11624 11643 1 4 

SEQ ID NO: 1702 atttcttaaacattccttt 9481 9500 1 4 

SEQ ID NO: 1703 gaatctggctccctcaact 9039 9058 1 4 

SEQ ID NO: 1704 tcattctgggtctttccag 11027 11046 1 4 

SEQ ID NO: 1705 atagcatggacttcttctg 8865 8884 1 4 

SEQ ID NO: 1706 cttctggcttgctaacctc 12298 12317 1 4 

SEQ ID NO: 1707 cacggagttactgaaaaag 13715 13734 1 4 

SEQ ID NO: 1708 aaggcatctccacctcagc 12094 12113 1 4 

SEQ ID NO: 1709 tccaagatgagatcaacac 13096 13115 1 4 

SEQ ID NO: 1710 gttgagaagcxccaagaat 6246 6265 1 4 

SEQ ID NO: 1711 aaattctcttttcttttca 9212 9231 1 4 

SEQ ID NO: 1712 tgaaagtcaagcatctgat 12661 12680 1 4 

SEQ ID NO: 1713 catccttaacaccttccao 8063 8082 1 4 

SEQ ID NO: 1714 tgtaccataagccatattt 10080 10099 1 4 

SEQ ID NO: 1715 ttgatgttagagtgctttt 6985 7004 1 4 

SEQ ID NO: 1716 tctgcacagaaatattcag 13439 13458 1 4 

SEQ ID NO: 1717 taaatggagtctttattgt 14078 14097 1 4 

SEQ ID NO: 1718 ataaatggagtctttattg 14077 14096 1 4 

SEQ ID NO: 1719 atattgtcagtgcctctaa 13384 13403 1 4 

SEQ ID NO: 1720 attactatgaaaaatttat 13633 13652 1 4 
SEQ ID NO: 718


ccaactaatagaagataac 


13031 


13050 


SEQ ID NO: 1721 


gttattttgctaaacttgg


14044 


14063 1




SEQ ID NO: 719 


ttaattatatccaagatga 


13087 


13106 


SEQ ID NO: 1722 


tcatcctctaatntuaa 


13792 


13811 1


4 


SEQ ID NO: 720 


tttaaattgttgaaagaaa 


13143 


13162 


SEQ ID NO: 1723 


tttcatttgaaagaataaa 


7024 


7043 1 


4 


SEQ ID NO: 721 


aagttcaatgaatttattc 


13182 


13201 


SEQ ID NO: 1724 


gaataccaatgctgaactt


10160 


10179 1 


4 


SEQ ID NO: 722 


ttgaagaaaagatagtcag 


13318 


13337 


SEQ ID NO: 1725 


ctgagagaagtgtcttcaa


12399 


12418 1 


4 


SEQ ID NO: 723 


acttccattctgaatatat 


13369 


13388 


SEQ ID NO: 1726 


atatctggaaccttgaagt


10729 


10748 1


4 


SEQ ID NO: 724 


cacagaaatattcaggaat 


13443 


13462 


SEQ ID NO: 1727


attccctgaagttgatgtg


11480


11499 1


4


SEQ ID NO: 725 


ccattgcgacgaagaaaat 


13552 


13571 


SEQ ID NO: 1728


atttttattcctgccatgg


10095


10114 1


4


SEQ ID NO: 726 


tataaactgcaagattttt 


13599 


13618 


SEQ ID NO: 1729


aaaattcaaactgcctata


13865


13884 1


4


SEQ ID NO: 727 


tctgattactatgaaaaat


13629 


13648 


SEQ ID NO: 1730


atttgtaagaaaatacaga


RAnQ 

O4^o 


044 f 1 


4 


SEQ ID NO: 728 


ggagttactgaaaaagctg 


13718 


13737 


SEQ ID NO: 1731 


cagcatgcctagtttctcc 


9944


9963 1


4


SEQ ID NO: 729 


tgaagcttgctccaggaga 


13765 


13784 


SEQ ID NO: 1732 


tctcctttcttcatcttca


10205 


10224 1


4 


SEQ ID NO: 730 


tgaactggacctgcaccaa 


13947 


13966 


SEQ ID NO: 1733 


ttggtagagcaagggttca 


7848 


7867 1 


4 


SEQ ID NO: 731 


ttgctaaacttgggggagg 


14050 


14069 


SEQ ID NO: 1734 


cctcctacagtggtggcaa 


4222 


4241 1 


4 


SEQ ID NO: 732 


gattcgaatatcaaattca 


4404 


4423 


SEQ ID NO: 1735 


tgaaaacgacaaagcaatc 


9595 


9614 2


3


SEQ ID NO: 733 


atttgtttgtcaaagaagt 


4543 


4562 


SEQ ID NO: 1736


acttttctaaacttgaaat


9055


9074 2


3


SEQ ID NO: 734 


tctcggttgctgccgctga 


25 


44 


SEQ ID NO: 1737 


tcagcccagccatttgaga 


9226 




Q 

o 


SEQ ID NO: 735 


gctgaggagcccgcccagc 


39 


58 


SEQ ID NO: 1738


gctggatgtaaccaccagc


11177


11196 2


3


SEQ ID NO: 736


ctggtctgtccaaaagatg 


219 




SEQ ID NO: 1739


catcagaaccattgaccag


8126


8145 2


3


SEQ ID NO: 737 


ctgagagttccagtggagt 


283 


302 


SEQ ID NO: 1740


actcaatggtgaaattcag


7457


7476 2


3 


SEQ ID NO: 738


cagtgcaccctgaaagagg 


396 


415 


SEQ ID NO: 1741


cctcacttcctttggactg


8969


8988 2


3


SEQ ID NO: 739 


ctctgaggagtttgctgca 


464 


483 


SEQ ID NO: 1742 


tgcaaacttgacttcagag 




11410 2


3


SEQ ID NO: 740 


acatcaagaggggcatcat 


574 


593 


SEQ ID NO: 1743 


atgacgttcttgagcatgt 


704^ 


/ODl ^ 


o 
O 


SEQ ID NO: 741 


ctgatcagcagcagccagt 


822 


841 


SEQ ID NO: 1744 


actggacttctctagtcag 


8801


8820 2


3 


SEQ ID NO: 742 


ggacgctaagaggaagcat 


857 


876 


SEQ ID NO: 1745 


atgcctacgttccatgtcc 


11346 


11365 2


3 


SEQ ID NO: 743 


agctgttttgaagactctc 


1079 


1098 


SEQ ID NO: 1746 


gagaagtgtcttcaaagct 


12403 


12422 2 


3 


SEQ ID NO: 744 


tgaaaaaactaaccatctc 


1105 


1124 


SEQ ID NO: 1747 


gagatcaacacaatcttca 


13104


13123 2


3


SEQ ID NO: 745 


ctgagctgagaggcctcag 


1168 


1187 


SEQ ID NO: 1748


ctgaattactgcacctcag 


3027 


3046 2 


3 


SEQ ID NO: 746 


tgaaacgtgtgcatgccaa 


1303 


1322 


SEQ ID NO: 1749 


ttggtagagcaagggttca 


7848 


7867 2


3 


SEQ ID NO: 747 


ccttgtatgcgctgagcca


1432 


1451 


SEQ ID NO: 1750 


tggcactgtttggagaagg 


9130 


9149 2 


3 


SEQ ID NO: 748 


aggagctgctggacattgc 


1492 


1511 


SEQ ID NO: 1751


gcaagtcagcccagttcct


10920


10939 2


3


SEQ ID NO: 749 


atttgattctgcgggtcat 


1567 


1586 


SEQ ID NO: 1752 


atgaaaccaatgacaaaat 


7420 


7439 2 


3 


SEQ ID NO: 750 


tccagaactcaagtcttca


1619 


1638 


SEQ ID NO: 1753 


tgaaatacaatgctctgga 


5512 


5531 2 


3 


SEQ ID NO: 751 


ggttcttcttcagactttc 


1736 


1755 


SEQ ID NO: 1754 


gaaataccaagtcaaaacc 


10447 


10466 2 


3 


SEQ ID NO: 752


gttgatgaggagtccttca 


1802 


1821 


SEQ ID NO: 1755 


tgaaaaagctgcaatcaac 


13728 


13745 2 


3 


SEQ ID NO: 753 


tccaagatctgaaaaagtt 


1933 


1952 


SEQ ID NO: 1756 


aactgcttctccaaatgga 


3544 


3563 2 


3 


SEQ ID NO: 754 


agttagtgaaagaagttct 


1948 


1967 


SEQ ID NO: 1757 


agaattcataatcccaact 


8267 


8286 2 


3 


SEQ ID NO: 755 


gaagggaatcttatatttg 


2076 


2095 


SEQ ID NO: 1758 


caaaacctactgtctcttc 


10459 


10478 2 


3 
SEQ ID NO: 756


ggaagctctttttgggaag 


2213 


2232 


SEQ ID NO: 1759 


cttcacataccagaattcc 


8316 


8335 2 


3 


SEQ ID NO: 757 


tggaataatgctcagtgtt 


2366 


2385


SEQ ID NO: 1760 


aacaaacacaggcattcca 


9648 


9667 2 


3 


SEQ ID NO: 758 


gatttgaaatccaaagaag 


2400 


2419 


SEQ ID NO: 1761 


cttcatgtccctagaaatc 


10029 


10048 2 


3 


SEQ ID NO: 759 


tccaaagaagtcccggaag 


2409 


2428 


SEQ ID NO: 1762 


cttcagcctgctttctgga 


4943 


4962 2 


3 


SEQ ID NO: 760 


aggaagggctcaaagaatg 


2562


2581 


SEQ ID NO: 1763 


cattagagctgccagtcct 


10012 


10031 2 


3 


SEQ ID NO: 761 


agaatgacttttttcttca


2575 


2594 


SEQ ID NO: 1764 


tgaagatgacgacttttct 


12152 


12171 2 


3 


SEQ ID NO: 762 


tttgtgacaaatatgggca 


2757 


2776 


SEQ ID NO: 1765 


tgccagtttgaaaaacaaa 


11807 


11826 2 


3 


SEQ ID NO: 763 


ctgaggctaccatgacatt 


3244 


3263 


SEQ ID NO: 1766 


aatgtcagctcttgttcag 


10895 


10914 2 


3 


SEQ ID NO: 764 


gtagataccaaaaaaatga 


3660 


3679 


SEQ ID NO: 1767 


tcatttgccctcaacctac 


11442 


11461 2


3 


SEQ ID NO: 765 


aaatgacttccaatttccc 


3673 


3692 


SEQ ID NO: 1768 


gggaactgttgaaagattt 


12919 


12938 2 


3 


SEQ ID NO: 766 


atgacttccaatttccctg 


3675 


3694 


SEQ ID NO: 1769 


caggagaacttactatcat 


13777 


13796 2 


3 


SEQ ID NO: 767 


atctgccatctcgagagtt 


4096 


4115 


SEQ ID NO: 1770 


aactcctccactgaaagat 


9539 


9558 2 


3 


SEQ ID NO: 768 


atttgtttgtcaaagaagt 


4543 


4562 


SEQ ID NO: 1771 


acttccgtttaccagaaat 


8239 


8258 2 


3 


SEQ ID NO: 769 


gcagagcttggcctctctg 


5127


5146 


SEQ ID NO: 1772 


cagagctttctgccactgc 


13510 


13529 2 


3 


SEQ ID NO: 770 


atatgctgaaatgaaattt 


5345


5364 


SEQ ID NO: 1773 


aaattcaaactgcctatat 


13866 


13885 2 


3 


SEQ ID NO: 771 


tcaaaacttgacaacattt 


5412 


5431 


SEQ ID NO: 1774 


aaatacttccacaaattga 


8772 


8791 2 


3 


SEQ ID NO: 772 


cagtgacctgaaatacaat 


5504 


5523 


SEQ ID NO: 1775 


attgaacatccccaaactg 


8786 


8805 2 


3 


SEQ ID NO: 773 


tacaaatggcaatgggaaa 


5840 


5859 


SEQ ID NO: 1776 


tttcaactgcctttgtgta 


11221 


11240 2 


3 


SEQ ID NO: 774 


cttttgtaaagtatgataa 


6277 


6296 


SEQ ID NO: 1777 


ttattgctgaatccaaaag 


13648 


13667 2 


3 


SEQ ID NO: 775 


ttgtaaagtatgataaaaa 


6280 


6299 


SEQ ID NO: 1778 


ttttcaagcaaatgcacaa 


8531 


8550 2 


3 


SEQ ID NO: 776 


tccattaacctcccatttt 


6312 


6331 


SEQ ID NO: 1779 


aaaagaaaattttgctgga 


10748 


10767 2 


3 


SEQ ID NO: 777 


gattatctgaattcattca 


6480 


6499 


SEQ ID NO: 1780 


tgaagtagaccaacaaatc 


7154 


7173 2 


3 


SEQ ID NO: 778 


aattgggagagacaagttt 


6498 


6517 


SEQ ID NO: 1781 


aaactaaatgatctaaatt 


11316 


11335 2 


3 


SEQ ID NO: 779 


atttgaaaatagctattgc 


6688


6707 


SEQ ID NO: 1782 


gcaatttctgcacagaaat 


13433 


13452 2 


3 


SEQ ID NO: 780 


tgagcatgtcaaacacttt


7052 


7071 


SEQ ID NO: 1783 


aaagccattcagtctctca 


12963 


12982 2 


3 


SEQ ID NO: 781 


ttgaagatgttaacaaatt 


7348 


7367 


SEQ ID NO: 1784 


aattccatatgaaagtcaa 


12652 


12671 2 


3 


SEQ ID NO: 782 


acttgtcacctacatttct 


7746 


7764 


SEQ ID NO: 1785 


agaatattttgatccaagt 


13268 


13287 2 


3 


SEQ ID NO: 783 


gttttccacaccagaattt 


8042 


8061 


SEQ ID NO: 1786 


aaatctggatttcttaaac 


9473 


9492 2 


3 


SEQ ID NO: 784 


ataagtacaaccaaaattt 


9397 


9416 


SEQ ID NO: 1787 


aaataaatggagtctttat 


14076 


14094 2 


3 


SEQ ID NO: 785 


cgggacctgcggggctgag 


0 


19 


SEQ ID NO: 1788 


ctcagttaactgtgtcccg 


11563 


11582 1 


3 


SEQ ID NO: 786 


agtgcccttctcggttgct 


17 


36 


SEQ ID NO: 1789 


agcatctgattgactcact 


12670 


12689 1 


3 


SEQ ID NO: 787 


gctgaggagcccgcccagc 39 


58 


SEQ ID NO: 1790 


gctgattgaggtatccagc 


1217 


1236 1 


3 


SEQ ID NO: 788 


gaggagcccgcccagccag 42 


61 


SEQ ID NO: 1791 


ctggatcacagagtccctc 


3744 


3763 1 


3 


SEQ ID NO: 789 


gggccgcgaggccgaggcc 64 


83 


SEQ ID NO: 1792 


ggccctgatccccgagccc 


1355 


1374 1 


3 


SEQ ID NO: 790 


ccaggccgcagcccaggag 81 


100 


SEQ ID NO: 1793 


ctcccggagccaaggctgg 


2674 


2693 1 


3 


SEQ ID NO: 791 


ggagccgccccaccgcagc 96 


115 


SEQ ID NO: 1794 


gctgttttgaagactctcc 


1080 


1099 1 


3 


SEQ ID NO: 792 


gaagaggaaatgctggaaa 


192 


211 


SEQ ID NO: 1795 


tttcaagttcctgaccttc 


8301 


8320 1 


3 


SEQ ID NO: 793 


caaaagatgcgacccgatt 


229 


248 


SEQ ID NO: 1796 


aatcttattggggattttg 


7077 


7096 1 


3 
SEQ ID NO: 794 


diiueici^wci^A/iwi/yyelGiy 


cMO 




SEQ ID NO: 1797 


cttccacatttcaaggaat 


10059 


10078 1 


3 


SEQ ID NO: 795 








SEQ ID NO: 1798 


ccagcaagtacctgagaac 


8602 


8621 


1 


3 


SEQ ID NO: 796 




ouo 




SEQ ID NO: 1799 


acttgaagaaaagatagtc 


13316 13335 1 


3 


SEQ ID NO: 797 


y ly ^uauuay y aiuaaciy 






SEQ ID NO: 1800 


cagtgaagctgcagggcac 



10696 


10715 1 


3 


SEQ ID NO: 798 


yaiwcieioLy uooy y iiyay 




'XtiA 


SEQ ID NO: 1801 


ctcacctccacctctgatc 


4740 


4759 


1 


3 


SEQ ID NO: 799 


oviyudayyiiyayciyya 




OO9 


SEQ ID NO: 1802 


tccactcacatcctccagt 


1281 


1300 


1 


3 


SEQ ID NO: 800 


n r*fr*tn r* q n ntf/^o f r» 
uueiyv,fii«iyL«ayLriiuaLC 


ooo 


Oo4 


SEQ ID NO: 1803 


gatgtggtcacctacctgg 


1335 



1354 


1 


3 


SEQ ID NO: 801 


aywucaiCwiyaayaCwa 


QfO 




SEQ ID NO: 1804 


tggtgctggagaatgagct 


5104 


5123 


1 


3 


SEQ ID NO: 802 


uuUalUwiyoayaCCayC 


^-7*7 




SEQ ID NO: 1805 


gctggagtaaaactggaag 


2688 


2707 


1 


3 


SEQ ID NO: 803 


ccay vudy ly cau vCiy aa 




41U 


SEQ ID NO: 1806 


ttcaagatgactgcactgg 


1531 



1550 


1 


3 


SEQ ID NO: 804 


way ly ua wwciy aaay ay y 




4T0 


SEQ ID NO: 1807 


cctcacagagctatcactg 


5222 


5241 


1 


3 


SEQ ID NO: 805 


lyy^Liudauowiydyyyc 


A1Q 
*f 1 C7 




SEQ ID NO: 1808 


gcccactggtcgcctgcca 



3525 


3544 


1 


3 


SEQ ID NO: 806 


uucaoccciy ay y y caaa 






SEQ ID NO: 1809 


tttgagccaacattggaag 


2199 


2218 


1 


3 


SEQ ID NO: 807 


li d wuw ly eiy y y a eaay 






SEQ ID NO: 1810 


ctngacaggcattttgaa 


9719 


9738 


1 


3 


SEQ ID NO: 808 


uuy o Ly oay aaaoCwaag 






SEQ ID NO: 1811 


cttgaaattcaatcacaag 


9066 


9085 


1 


3 


SEQ ID NO: 809 


tn^tn9snofii99/^f^9o/t99 
lywiydayaaaawcaayaa 


AAfi. 


ACiA 


SEQ ID NO: 1812 


ttctgctgccttatcagca 



5639 


5658 


1 


3 


SEQ ID NO: 810 


iiyciycayccaiyiccay 






SEQ ID NO: 1813 


ctggtcagtttgcaagcaa 



2996 


3015 


1 


3 


SEQ ID NO: 811 


ly (rfiy way wca ly iccay y 




4yo 


SEQ ID NO: 1814 


cctggtcagtttgcaagca 


2995 


3014 


1 


3 


SEQ ID NO: 812 


ay uudiy ii/way y idiy ay 


AM 


ou 1 


SEQ ID NO: 1815 


ctcacatcctccagtggct 


1285 


1304 


1 


3 


SEQ ID NO: 813 


ayoiwddyuiyycwaucw 


AQO 


010 


SEQ ID NO: 1816 


ggaactaccacaaaaagct 


7481 


7500 


1 


3 


SEQ ID NO: 814 


ayaayyyaaycagyiuic 


010 


Oof 


SEQ ID NO: 1817 


gaaatcttcaatttattct 


13813 


13832 1 


3 


SEQ ID NO: 815 


aayyyaayCdyyiniccl 


0£.KJ 


coo 
039 


SEQ ID NO: 1818 


aggacaccaaaataacctt 


7564 


7583 


1 


3 


SEQ ID NO: 816 


dyddayaiyaacwiaciia 




ODD 


SEQ ID NO: 1819 


taagaacfttgccacttct 


4844 


4863 


1 


3 


SEQ ID NO: 817 


diuuLydauolUaayayyy 


OD/ 


000 


SEQ ID NO: 1820 


ccctaacagatttgaggat 


7969 


7988 


1 


3 


SEQ ID NO: 818 


iccigddCdiCdayayyyy 


ESQ 


Oo7 


SEQ ID NO: 1821 


cccctaacagatttgagga 


7968 


7987 


1 


3 


SEQ ID NO: 819 


cigaacatcaagaggggca 


570 


589 


SEQ ID NO: 1822 


tgcctgcctttgaagtcag 


7900 


7919 


1 


3 


SEQ ID NO: 820 


dacaicaagagyggcdica 


0/ o 


coo 
o9^ 


SEQ ID NO: 1823 


tgataaaaaccaagatgtt 


6290 


6309 


1 


3 


SEQ ID NO: 821 



acaicaagagg g g caiCai 


K7>l 


coo 

093 


SEQ ID NO: 1824 


atgataaaaaccaagatgt 



6289 


6308 


1 


3 


SEQ ID NO: 822 


lUdiiiuiyvCwicciyyi 


0O9 




SEQ ID NO: 1825 


accaccagtttgtagatga 


7405 


7424 


1 


3 


SEQ ID NO: 823 


ncucwcagagacagaaga 


0\Jf 




SEQ ID NO: 1826 


tcttccacatttcaaggaa 


10058 


10077 1 


3 


SEQ ID NO: 824 


yddyddywuddywaayiyi 


1 


D4U 


SEQ ID NO: 1827 


acaccttccacattccttc 


8071 


8090 


1 


3 


SEQ ID NO: 825 


uyuiciggaiaccgigi 


COO 


ceo 

000 


SEQ ID NO: 1828 


acactaaatacttccacaa 


8767 


8786 


1 


3 


SEQ ID NO: 826 


tgtatggaaactgctccac 


655 


674 


SEQ ID NO: 1829 


gtggaggcaacacattaca 


2920 


2939 


1 


3 


SEQ ID NO: 827 


aaactgctccactcacttt 


662 


681 


SEQ ID NO: 1830 


aaagaaacagcatttgttt 


4532 


4551 


1 


3 


SEQ ID NO: 828 


actcactttaccgtcaaga 


672 


691 


SEQ ID NO: 1831 


tcttacttttccattgagt 


10572 


10591 


1 


3 


SEQ ID NO: 829 


ctttaccgtcaagacgagg 


677 


696 


SEQ ID NO: 1832 


cctccagctcctgggaaag 


2483 


2502 


1 


3 


SEQ ID NO: 830 


ttaccgtcaagacgaggaa 


679 


698 


SEQ ID NO: 1833 


ttcctaaagctggatgtaa 


11169 


11188 1 


3 


SEQ ID NO: 831 


acgaggaagggcaatgtgg 


690 


709 


SEQ ID NO: 1834 


ccacaagtcatcatctcgt 


6956 


5975 


1 


3 
SEQ ID NO: 832 


coaaaaaaaacaatatanr 




71 n 


SEQ ID NO: 833 


Q aaaaaa o a caata ta aca 




711 



SEQ ID NO: 834 



oaaaaaacaatatacicaac 


694 


71? 


SEQ ID NO: 835 


aaaaaocaatataacaaca 





714 


SEQ ID NO: 836 


caaacateaacccacttdc 


769 




SEQ ID NO: 837 


aacicatcaacccacttact 


770 




SEQ ID NO: 838 


tcaacccacttactctcat 


775 



794 



SEQ ID NO: 839 


Qtcaactctaatcaacaac 


815 



834 


SEQ ID NO: 840 


oaacactaaaaaa aaocat 



857 


R7R 



SEQ ID NO: 841 


aaaoaocaacacctcttcc 


894 



913 



SEQ ID NO: 842 


aaaaa caacacctcttcct 


895 



914 


SEQ ID NO: 843 


caacacctcttcctacctt 


900 


919 


SEQ ID NO: 844 


aaca cntcf tfiotn i^rtH 






SEQ ID NO: 845 


Q vciciy aa laciy la lyy^ d I 






SEQ ID NO: 846 


caaaaataaatatanrmtn 


WA>0 


w*tO 


SEQ ID NO: 847 


^«y vGi vaciy iy a wci wCi y Q w 


94R 


«7Do 


SEQ ID NO: 848 


acicscaa o toa tiaoanQot 

ny vavBGy ly etwaweiyci^i 


y*T f 


QRR 


SEQ ID NO: 849 


ocacaantnanananar^tf 


w*tO 


QR7 


SEQ ID NO: 850 



aacttaaaaacacaccaaa 


970 





SEQ ID NO: 851 


acttctttaatoaaaatac 



1000 


101Q 


SEQ ID NO: 852 


ntttn n tn a a n f! ta of a a ri 
^oiiiy y ly aeiy y LdV^ldciy 


inn4 

1 wUt 


1 nOQ 


SEQ ID NO: 853 


tactaaoaflnnfnnnppfc 

mwkaayaayaiy yy^Olv 




lUOo 


SEQ ID NO: 854 


tttoaaaacaccaaatcca 


1 www 


1057 


SEQ ID NO: 855 


aoaacaccaaatccacatc 


1042 



10R1 
Iwv 1 


SEQ ID NO: 856 


aactattttoaaaactctc 



107Q 

1 U ( w 


1 0«70 


SEQ ID NO: 857 


toaaaaaactaaccatrifft 


1 1 Ww 


1124 


SEQ ID NO: 858 


aaaaaaactaaccatctct 


1106 


1125 


SEQ ID NO: 859 


tctaaocaaaatatccana 



1122 


1141 


SEQ ID NO: 860 


tctcttcaataaactaatf 


114A 


11R7 


SEQ ID NO: 861 


V ty ay \« ly a y ay y bVi^/Giy 


1 1RR 


1 1R7 
1 lOr 


SEQ ID NO: 862 


taaaacaatcacatctctc 


1190 



1209 


SEQ ID NO: 863 


aag cagicacaiClCiCii 


ny2 


izn 


SEQ ID NO: 864 


ctctcttgccacagctgat 


1204 


1223 


SEQ ID NO: 865 


tcttgccacagctgattga 


1207 


1226 


SEQ ID NO: 866 


cttgccacagctgattgag 


1208 


1227 


SEQ ID NO: 867 


tgaggtgtccagccccatc 


1223 


1242 


SEQ ID NO: 868 


tcagtgtggacagcctcag 


1259 


1278 


SEQ ID NO: 869 


acatcctccagtggctgaa 


1288 


1307 



SEQ ID NO: 1835 


gccagaagtgagatcctcg 


3607 


3526 


1 


3 


SEQ ID NO: 1836 


tgccagtctccatgacctc 


2468 


2487 


1 


3 


SEQ ID NO: 1837 


gngcicttaaggacttcc 


13356 13376 1 


3 


SEQ ID NO: 1838 


tgttgatgaggagtccttc 


1801 


1820 


1 


3 




gcaagtctttcctggcctg 


3011 


3030 


1 


3 




agcaagtctttcctggcct 


3010 


3029 


1 


3 


SEQ ID NO: 1841 


atgaaagtcaagcatctga 


12660 


12679 1 


3 


SEQ ID NO: 1842 


gctgactnaaaatctgac 


4811 


4830 


1 


3 




aigcactgtttctgagtcc 


9331 


9350 


1 


3 


SEQ ID NO: 1844 


ggaatatcttagcatcctt 


13457 13476 1 


3 


SEQ ID NO: 1845 


aggaatatcttagcatcct 


13456 


13475 1 


3 




aaggctgactctgtggttg 


4284 


4303 


1 


3 


SEQ ID NO: 1847 


aaagcaggccgaagctgtt 


1067 


1086 


1 


3 


SEQ ID NO: 1848 


atccatgatctacatttgt 


6786 


6805 


1 


3 




catcactttacaagccttg 


1238 


1257 


1 


3 


SEQ ID NO: 1850 


gtctcttcgttctatgcta 


4584 


4603 


1 



3 


SEQ ID NO: 1851 


agtctcttcgttctatgct 


4583 


4602 


1 


3 


SEQ ID NO: 1852 



aagtgtagtctcctggtgc 


5091 


5110 


1 


3 


SEQ ID NO: 1853 


tttgaggattccatcagtt 


7979 


7998 


1 


3 


SEQ ID NO: 1854 



gcacctacttttggcaagc 


8364 


8383 


1 


3 


SEQ ID NO: 1855 


cttatgggatttcctaaag 


11159 11178 1 


3 


SEQ ID NO: 1856 


gagggtagtcataacagta 


10329 


10348 1 


3 


SEQ ID NO: 1857 


tggaagtgtcagtggcaaa 


10372 


10391 


1 


3 


SEQ ID NO: 1858 



gatggatatgaccttctci 


4868 


4887 


1 


3 





gagaacatactgggcagct 


5872 


5891 


1 


3 


SEQ ID NO: 1860 


gagaaaatcaatgccttca 


7104 


7123 


1 


3 


SEQ ID NO: 1861 



agagccaggtcgagctttc 


11044 


11063 1 


3 


SEQ ID NO: 1862 


tctgatgaggaaactcaga 


12252 


12271 


1 



3 


SEQ ID NO: 1863 



aacctcccattttttgaga 


6318 


6337 


1 


3 


SEQ ID NO: 1864 


ctgatccccgagccctcag 


1359 


1378 


1 


3 


SEQ ID NO: 1865 



gagaaaatcaatgccttca 


7104 


7123 


1 



3 


SEQ ID NO: 1866 


aagaggcagcttctggctt 


12289 12308 1 


3 


SEQ ID NO: 1867 


atcaaaagaagcccaagag 


12938 12957 1 


3 


SEQ ID NO: 1868 


tcaaagttaattgggaaga 


12271 


12290 1 


3 


SEQ ID NO: 1869 


ctcaattttgattttcaag 


8520 


8539 


1 


3 


SEQ ID NO: 1870 


gatggaaccctctccctca 


4725 


4744 


1 


3 


SEQ ID NO: 1871 


ctgacatcttaggcactga 


4993 


5012 


1 


3 


SEQ ID NO: 1872 


ttcagaagctaagcaatgt 


7231 


7250 


1 


3 
SEQ ID NO: 870 


gcacagcagctgcgagaga 1377 




SEQ ID NO: 871 


cagcagctgcgagagatct 


1380 


1399 


SEQ ID NO: 872 


gcgagggatcagcgcagcc 1407 




SEQ ID NO: 873 


aagacaaaccctacaggga 1470 




SEQ ID NO: 874 


caggagctgctggacatta 


1491 




SEQ ID NO: 875 


aggagctgctggacattgc 


1492 




SEQ ID NO: 876 


ctactgaacattactaatt 


1497 


1 o to 


SEQ ID NO: 877 


gattacacctatttgattc 


1657 


1 *?7R 
1 3r W 


SEQ ID NO: 878 


atttgattctgcgggtcat 


1567 


1 wOO 


SEQ ID NO: 879 


tctgcgggtcattggaaat 


1674 




SEQ ID NO: 880 


aaccatggagcagttaact 


1601 




SEQ ID NO: 881 


ggagcagttaactccagaa 


1607 


1 v£w 


SEQ ID NO: 882 


actccagaactcaagtctt 


1617 


1636 



SEQ ID NO: 883 


tccagaactcaagtcttca 


1619 


1 U*jO 


SEQ ID NO: 884 


aagtacaaagccatcacta 


1655 


1674 



SEQ ID NO: 885 


gccatcactgataatccaa 


1664 


IDuO 


SEQ ID NO: 886 


ccatcactgatgatccaaa 


1665 




SEQ ID NO: 887 


atccagaaaactaccatcc 


1677 




SEQ ID NO: 888 


cagaaaQctQccatccaaa 


1680 




SEQ ID NO: 889 


acaaggaccaggaaattct 


1723 


1740 


SEQ ID NO: 890 


aggaccaggaggttcttct 


1726 



174R 
1 f *tO 


SEQ ID NO: 891 


accaggaggttcttcttca 


1729 


1 f HO 


SEQ ID NO: 892 


tcttcagactttccttgat 


1742 


1 f O 1 


SEQ ID NO: 893 


ttcagactttccttgatga 


1744 


17^*^ 


SEQ ID NO: 894 


gttgatgaggagtccttca 


1802 


1 o& 1 


SEQ ID NO: 895 


cttcacaggcagatattaa 


1816 




SEQ ID NO: 896 


ttcacaggcagatattaac 


1817 


1 A^R 
1 OOw 


SEQ ID NO: 897 


^ ^ vO}^ CI la iicici woa a ci 11 




1 AA9 


SEQ ID NO: 898 


atattaacaaaattgtcca 


1828 


1ft47 

1 0*T/ 


SEQ ID NO: 899 


acaaaattgtccaaattct 


1834 




SEQ ID NO: 900 


gagcaagtgaagaactttg 


1869 


1 AAA 
1 ooo 


SEQ ID NO: 901 


gtgaagaactttgtggctt 


1875 


1894 


SEQ ID NO: 902 


agaactttgtggcttccca 


1879 


1898 


SEQ ID NO: 903 


tttgtggcttcccatattg 


1884 


1903 


SEQ ID NO: 904 


tggcttcccatattgccaa 


1888 


1907 


SEQ ID NO: 905 


ttcccatattgccaatatc 


1892 


1911 


SEQ ID NO: 906 


tcccatattgccaatatct 


1893 


1912 


SEQ ID NO: 907 


ttgccaatatcttgaactc 


1900 


1919 



SEQ ID NO: 1873 tctctgaaagacaacgtgc 12315 12334 1 3 

SEQ ID NO: 1874 agataacattaaacagctg 13043 13062 1 3 

SEQ ID NO: 1875 ggctcaacacagacatcgc 5710 5729 1 3 

SEQ ID NO: 1876 tcccagaaaacctcttctt 3928 3947 1 3 

SEQ ID NO: 1877 caatggagagtccaacctg 4652 4671 1 3 

SEQ ID NO: 1878 gcaagggttcactgttcct 7856 7875 1 3 

SEQ ID NO: 1879 aattgggaagaagaggcag 12279 12298 1 3 

SEQ ID NO: 1880 gaatattttgagaggaatc 6345 6364 1 3 

SEQ ID NO: 1881 atgaagtagaccaacaaat 7153 7172 1 3 

SEQ ID NO: 1882 atttgtaagaaaatacaga 6428 6447 1 3 

SEQ ID NO: 1883 agtttctccatcctaggtt 9954 9973 1 3 

SEQ ID NO: 1884 ttctgaaaatccaatctcc 8392 8411 1 3 

SEQ ID NO: 1885 aagatcgcagactttgagt 11646 11665 1 3 

SEQ ID NO: 1886 tgaactcagaagaattgga 1912 1931 1 3 

SEQ ID NO: 1887 cagtcatgtagaaaaactt 4421 4440 1 3 

SEQ ID NO: 1888 ctggaactctctccatggc 10875 10894 1 3 

SEQ ID NO: 1889 tctgaactcagaaggatgg 13991 14010 1 3 

SEQ ID NO: 1890 ggatttcctaaagctggat 11165 11184 1 3 

SEQ ID NO: 1891 cctgaaatacaatgctctg 5510 5529 1 3 

SEQ ID NO: 1892 agaaacagcatttgtttgt 4534 4553 1 3 

SEQ ID NO: 1893 agaagctaagcaatgtcct 7234 7253 1 3 

SEQ ID NO: 1894 tgaaggctgactctgtggt 4282 4301 1 3 

SEQ ID NO: 1895 atcaggaagggctcaaaga 2559 2578 1 3 

SEQ ID NO: 1896 tcattactcctgggctgaa 11299 11318 1 3 

SEQ ID NO: 1897 tgaatctggctccctcaac 9038 9057 1 3 

SEQ ID NO: 1898 ttaatcgagaggtatgaag 7140 7159 1 3 

SEQ ID NO: 1899 gttaatcgagaggtatgaa 7139 7158 1 3 

SEQ ID NO: 1900 aattgcattagatgatgcc 6581 6600 1 3 

SEQ ID NO: 1901 tggagtttgtgacaaatat 2752 2771 1 3 

SEQ ID NO: 1902 agaaacagcatttgtttgt 4534 4553 1 3 

SEQ ID NO: 1903 caaatgacatgatgggctc 5326 5345 1 3 

SEQ ID NO: 1904 aagcatctgattgactcac 12669 12688 1 3 

SEQ ID NO: 1905 tgggcctgccccagattct 8901 8920 1 3 

SEQ ID NO: 1906 caataagatcaatagcaaa 8990 9009 1 3 

SEQ ID NO: 1907 ttggctcacatgaaggcca 7623 7642 1 3 

SEQ ID NO: 1908 gatatacactagggaggaa 12737 12756 1 3 

SEQ ID NO: 1909 agatcaaagttaattggga 12268 12287 1 3 

SEQ ID NO: 1910 gagtcccagtgcocagcaa 9344 9363 1 3 
SEQ ID NO: 


908 


ttggatatccaagatctga 


1926 


1945 


SEQ ID NO: 1911 


tcagtataagtacaaccaa 


9392 


9411 1 


3 


SEQ ID NO: 


909 


tccaagatctgaaaaagtt 


1933 


1952 


SEQ ID NO: 1912 


aacttccaactgtcatgga 


1978 


1997 1 


3 


SEQ ID NO: 


910 


ctgaaaaagttagtgaaag 


1941 


1960 


SEQ ID NO: 1913 


ctttgaagtcagtcttcag 


7907 


7926 1 


3 


SEQ ID NO: 


911 


agttagtgaaagaagttct 


1948 


1967 


SEQ ID NO: 1914 


agaatctcaacttccaact 


1970 


1989 1 


3 


SEQ ID NO: 


912 


aatctcaacttccaactgt 


1972 


1991 


SEQ ID NO: 1915 


acaggggtcctttatgatt 


12342 


12361 1 


3 


SEQ ID NO: 


913 


gtcatggacttcagaaaat 


1989 


2008 


SEQ ID NO: 1916 


atttgaaagaataaatgac 


7028 


7047 1 


3 


SEQ ID NO: 


914 


tcaactctacaaatctgtt 


2021 


2040 


SEQ ID NO: 1917 


aacacattgaggctattga 


6970 


6989 1 


3 


SEQ ID NO: 


915 


aactctacaaatctgtttc 


2023 


2042 


SEQ ID NO: 1918 


gaaaaaggggattgaagtt 


10276 


10295 1 


3 


SEQ ID NO: 


916 


aaatagaagggaatcttat 


2071 


2090 


SEQ ID NO: 1919 


ataagcaaactgttaattt 


5449 


5468 1 


3 


SEQ ID NO: 


917 


agaagggaatcttatattt 


2075 


2094 


SEQ ID NO: 1920 


aaatgcactgctgcgttct 


4892 


4911 1 


3 


SEQ ID NO: 


918 


gaagggaatcttatatttg 


2076 


2095 


SEQ ID NO: 1921 


caaaaacattttcaacttc 


5279 


5298 1 


3 


SEQ ID NO: 


919 


tgatccaaataactacctt 


2093 


2112 


SEQ ID NO: 1922 


aaggaagaaagaaaaatca 


3453 


3472 1 


3 


SEQ ID NO: 


920 


tggatttgcttcagctgac 


2150 


2169 


SEQ ID NO: 1923 


gtcagcccagttccttcca 


10924 


10943 1 


3 


SEQ ID NO: 


921 


tttgcttcagctgacctca 



2154 


2173 


SEQ ID NO: 1924 


tgaggaaactcagatcaaa 


12257 


12276 1 


3 


SEQ ID NO: 


922 


cttggaaggaaaaggcttt 


2183 


2202 


SEQ ID NO: 1925 


aaagcattggtagagcaag 


7842 


7861 1 


3 


SEQ ID NO: 


923 


tggaaggaaaaggctttga 



2185 


2204 


SEQ ID NO: 1926 


tcaagtctgtgggattcca 


4078 


4097 1 


3 


SEQ ID NO: 


924 


ggctttgagccaacattgg 


2196 


2215 


SEQ ID NO: 1927 


ccaagaggtatttaaagcc 


12950 


12969 1 


3 


SEQ ID NO: 


925 


tgagccaacattggaagct 



2201 


2220 


SEQ ID NO: 1928 


agctttctgccactgctca 



13513 


13532 1 


3 


SEQ ID NO: 


926 


gagccaacattggaagctc 


2202 


2221 


SEQ ID NO: 1929 


gagctttctgccactgctc 


13512 


13531 1 


3 


SEQ ID NO: 


927 


aacattggaagctcttttt 


2207 


2226 


SEQ ID NO: 1930 


aaaagaaacagcatttgtt 



4531 


4550 1 


3 


SEQ ID NO: 


928 


tggaagctctttttgggaa 


2212 


2231 


SEQ ID NO: 1931 


ttccggcacgtgggttcca 



3777 


3796 1 


3 


SEQ ID NO: 


929 


ctctttttgggaagcaagg 


2218 


2237 


SEQ ID NO: 1932 


ccttactgactttgcagag 


7790 


7809 1 


3 


SEQ ID NO: 


930 


tttttgggaagcaaggatt 


2221 


2240 


SEQ ID NO: 1933 


aatcattgaaaaattaaaa 



6722 


6741 1 


3 


SEQ ID NO: 


931 


ttttcccagacagtgtcaa 



2239 


2258 


SEQ ID NO: 1934 


ttgatgaaatcattgaaaa 



6715 


6734 1 


3 


SEQ ID NO: 


932 


ttggctataccaaagatga 



2323 


2342 


SEQ ID NO: 1935 


tcattgctcccggagccaa 



2668 


2687 1 


3 


SEQ ID NO: 


933 


ataccaaagatgataaaca 



2329 


2348 


SEQ ID NO: 1936 


tgttgcttttgtaaagtat 



6272 


6291 1 


3 


SEQ ID NO: 


934 


gagcaggatatggtaaatg 


2349 


2368 


SEQ ID NO: 1937 


catttcagccttcgggctc 


4254 


4273 1 


3 


SEQ ID NO: 


935 


atggtaaatggaataatgc 


2358 


2377 


SEQ ID NO: 1938 


gcatgcctagtttctccat 


9946 


9965 1 


3 


SEQ ID NO: 


936 


tggtaaatggaataatgct 



2359 


2378 


SEQ ID NO: 1939 


agcacagtacgaaaaacca 



10801 


10820 1 


3 


SEQ ID NO: 


937 


taaatggaataatgctcag 



2362 


2381 


SEQ ID NO: 1940 


ctgaaagagatgaaattta 



13059 


13078 1 


3 


SEQ ID NO: 


938 


tggaataatgctcagtgtt 


2366 


2385 


SEQ ID NO: 1941 


aacagatttgaggattcca 



7973 


7992 1 


3 


SEQ ID NO: 


939 



tcagtgttgagaagctgat 


2377 


2396 


SEQ ID NO: 1942 


atcacaactcctccactaa 


9534 


9553 1 



3 



SEQ ID NO: 


940 


cagtgttgagaagctgatt 


2378 


2397 


SEQ ID NO: 1943 


aatcacaactcctccactg 


9533 


9552 1 


3 


SEQ ID NO: 


941 


agtgttgagaagctgatta 


2379 


2398 


SEQ ID NO: 1944 


taatcacaactcctccact 


9532 


9551 1 


3 


SEQ ID NO: 


942 


gattaaagatttgaaatcc 


2393 


2412 


SEQ ID NO: 1945 


ggatactaagtaccaaatc 


6866 


6885 1 


3 


SEQ ID NO: 


943 


gatttgaaatccaaagaag 


2400 


2419 


SEQ ID NO: 1946 


cttccgtttaccagaaatc 


8240 


8259 1 


3 


SEQ ID NO: 


944 


atttgaaatccaaagaagt 


2401 


2420 


SEQ ID NO: 1947 


acttccgtttaccagaaat 


8239 


8258 1 


3 


SEQ ID NO: 


945 


atccaaagaagtcccggaa 


2408 


2427 


SEQ ID NO: 1948 


ttccaatttccctgtggat 


3680 


3699 1 


3 
SEQ ID NO: 946 


tccaaagaagtcccggaag 


2409 


2428 


SEQ ID NO: 1949 


cttccaatttccctgtgga 


3679 


3698 


1 


3 


SEQ ID NO: 947 


agagcctacctccgcatct 


2430 


2449 


SEQ ID NO: 1950 


agattaatccgctggctct 


8563 


8582 


1 


3 


SEQ ID NO: 948 


gagcctacctccgcatctt 


2431 


2450 


SEQ ID NO: 1951 


aagattaatccgctggctc 


8562 


8581 


1 


3 


SEQ ID NO: 949 


cttgggagaggagcttggt 


2447 


2466 


SEQ ID NO: 1952 


accactgggacctaccaag 


12519 


12538 1 


3 


SEQ ID NO: 950 


ggagcttggttttgccagt 


2456 


2475 


SEQ ID NO: 1953 


actggtggcaaaaccctcc 


2726 


2745 


1 


3 


SEQ ID NO: 951 


ttggttttgccagtctcca 


2461 


2480 


SEQ ID NO: 1954 


tggagaagccacactccaa 


10763 


10782 1 


3 


SEQ ID NO: 952 


cagtctccatgacctccag 


2471 


2490 


SEQ ID NO: 1955 


ctggtcgcctgccaaactg 


3530 


3549 


1 


3 


SEQ ID NO: 953 


ctccatgacctccagctcc 


2475 


2494 


SEQ ID NO: 1956 


ggagtcattgctcccggag 


2664 


2683 


1 


3 


SEQ ID NO: 954 


ctgggaaagctgcttctga 


2493 


2512 


SEQ ID NO: 1957 


tcagaaagctaccttccag 


7931 


7950 


1 


3 


SEQ ID NO: 955 


gaggtcatcaggaagggct 


2553 


2572 


SEQ ID NO: 1958 


agccagaagtgagatcctc 


3506 


3525 


1 


3 


SEQ ID NO: 956 


aagaatgacttttttcttc 


2574 


2593 


SEQ ID NO: 1959 


gaaggcatctgggagtctt 


3827 


3846 


1 


3 


SEQ ID NO: 957 


cttttttcttcactacatc 


2582 


2601 


SEQ ID NO: 1960 


gatgcttacaacactaaag 


6099 


6118 


1 


3 


SEQ ID NO: 958 


catcttcatggagaatgcc 


2597 


2616 


SEQ ID NO: 1961 


ggcacttccaaaattgatg 


10710 


10729 1 


3 


SEQ ID NO: 959 


cttcatggagaatgccttt 


2600 


2619 


SEQ ID NO: 1962 


aaagttaattgggaagaag 


12273 


12292 1 


3 


SEQ ID NO: 960 


aatgcctttgaactcccca 


2610 


2629 


SEQ ID NO: 1963 


tgggctggcttcagccatt 


5729 


5748 


1 


3 


SEQ ID NO: 961 


gcctttgaactccccactg 


2613 


2632 


SEQ ID NO: 1964 


cagtctgaacattgcaggc 


5375 


5394 


1 


3 


SEQ ID NO: 962 


caaggctggagtaaaactg 


2684 


2703 


SEQ ID NO: 1965 


cagtgcaacgaccaacttg 


5072 


5091 


1 


3 


SEQ ID NO: 963 


tggagtaaaactggaagta 


2690 


2709 


SEQ ID NO: 1966 


tactccaacgccagctcca 


3051 


3070 


1 


3 


SEQ ID NO: 964 


ggaagtagccaacatgcag 


2702 


2721 


SEQ ID NO: 1967 


ctgccatctcgagagttcc 


4098 


4117 


1 


3 


SEQ ID NO: 965 


tttgtgacaaatatgggca 


2757 


2776 


SEQ ID NO: 1968 


tgcctttgtgtacaccaaa 


11228 


11247 1 


3 


SEQ ID NO: 966 


tgtgacaaatatgggcatc 


2759 


2778 


SEQ ID NO: 1969 


gatgggtctctacgccaca 


4377 


4396 


1 


3 


SEQ ID NO: 967 


ggacttcgctaggagtggg 


2786 


2805 


SEQ ID NO: 1970 


cccaaggccacaggggtcc 


12333 


12352 1 


3 


SEQ ID NO: 968 


gtggggtccagatgaacac 


2800 


2819 


SEQ ID NO: 1971 


gtgttctagacctctccac 


4171 


4190 


1 


3 


SEQ ID NO: 969 


ttccacgagtcgggtctgg 


2826 


2845 


SEQ ID NO: 1972 


ccagaatctgtaccaggaa 


12554 


12573 1 


3 


SEQ ID NO: 970 


agtcgggtctggaggctca 


2833 


2852 


SEQ ID NO: 1973 


tgagaactacgagctgact 


4799 


4818 


1 


3 


SEQ ID NO: 971 


tcgggtctggaggctcatg 


2835 


2854 


SEQ ID NO: 1974 


catgaaggccaaattccga 


7631 


7650 


1 


3 


SEQ ID NO: 972 


aaaagctgggaagctgaag 


2861 


2880 


SEQ ID NO: 1975 


cttccagacacctgatttt 


7943 


7962 


1 


3 


SEQ ID NO: 973 


aagctgaagtttatcattc 


2871 


2890 


SEQ ID NO: 1976 


gaatttacaattgttgctt 


6261 


6280 


1 


3 


SEQ ID NO: 974 


gagaccagtcaagctgctc 


2900 


2919 


SEQ ID NO: 1977 


gagcttcaggaagcttctc 


13206 


13225 1 


3 


SEQ ID NO: 975 


gcaacacattacatttggt 


2926 


2945 


SEQ ID NO: 1978 


accagtcagatattgttgc 


10183 


10202 1 


3 


SEQ ID NO: 976 


acattacatttggtctcta 


2931 


2950 


SEQ ID NO: 1979 


tagaatatgaactaaatgt 


11881 


11900 1 


3 


SEQ ID NO: 977 


cattacatttggtctctac 



2932 


2951 


SEQ ID NO: 1980 


gtagctgagaaaatcaatg 


7098 


7117 


1 


3 


SEQ ID NO: 978 


aaacggaggtgatcccacc 


2956 


2975 


SEQ ID NO: 1981 


ggtggataccctgaagttt 


3197 


3216 


1 


3 


SEQ ID NO: 979 


attgagaacaggcagtcct 


2979 


2998 


SEQ ID NO: 1982 


aggaaaagcgcacctcaat 


12023 12042 1 


3 


SEQ ID NO: 980 


tgagaacaggcagtcctgg 


2981 


3000 


SEQ ID NO: 1983 


ccagcttccccacatctca 


8333 


8352 


1 


3 


SEQ ID NO: 981 


ctgcacctcaggcgcttac 


3035 


3054 


SEQ ID NO: 1984 


gtaagaaaatacagagcag 


6432 


6451 


1 


3 


SEQ ID NO: 982 


tccacagactccgcctcct 


3066 


3085 


SEQ ID NO: 1985 


aggacagagccttggtgga 


3184 


3203 


1 


3 


SEQ ID NO: 983 


ctgaccggggacaccagat 


3093 


3112 


SEQ ID NO: 1986 


atctgatgaggaaactcag 


12251 


12270 1 


3 
SEQ ID NO: 984 


tagagctggaactgaggcc 


3112 


3131 


SEQ ID NO: 1987 


ggcctctctggggcatcta 


5136 


5155 


1 


3 




ctatgagctocagagagag 


3167 


3186 


SEQ ID NO: 1988 



ctctcacaaaaaagtatag 


6541 


6560 


1 


3 




cttggtggataccctgaag 


3194 


3213 


SEQ ID NO: 1989 


cttcaggaagcttctcaag 


13209 


13228 1 


3 


SEQ ID NO: 987 


ttgtaactcaagcagaagg 


3214 


3233 


SEQ ID NO: 1990 


ccttacacaataatcacaa 


9522 


9541 


1 


3 


SEQ ID NO: 988 


taactcaagcagaaggtgc 


3217 


3236 


SEQ ID NO: 1991 


gcacctagctggaaagtta 


6947 


6966 


1 


3 


SEQ ID NO: 989 


gcagaaggtgcgaagcaga 3225 


3244 


SEQ ID NO: 1992 


tctgtgggattccatctgc 


4083 


4102 


1 


3 


SEQ ID NO: 990 


cagaaggtgcgaagcagac 3226 




SEQ ID NO: 1993 


gtctgtgggattccatctg 


4082 


4101 


1 


3 


SEQ ID NO: 991 


gtatgaccttgtccagtga 


3280 




SEQ ID NO: 1994 


tcaccaacggagaacatac 


10843 


10862 1 


3 


SEQ ID NO: 992 


tatgaccttgtccagtgaa 


3281 


OOUU 


SEQ ID NO: 1995 


ttcaccaacggagaacata 


10842 


10861 


1 


3 


SEQ ID NO: 993 


gaagtccaaattccggatt 


3297 


oolo 


SEQ ID NO: 1996 


aatctcaagctttctcttc 


10044 10063 1 


3 


SEQ ID NO: 994 


gagggcaaaacgtcttaca 


3363 




SEQ ID NO: 1997 


tgtacaactggtccgcctc 


4207 


4226 


1 


3 


SEQ ID NO: 995 


agggcaaaacgtcttacag 


3364 


OOOO 


SEQ ID NO: 1998 


ctgttaggacaccagccct 


4054 


4073 


1 


3 


SEQ ID NO: 996 


gactcaccctggacattca 


3382 


3401 


SEQ ID NO: 1999 


tgaaattcaatcacaagtc 


9068 


9087 


1 


3 




ctggacattcagaacaaga 


3390 


3409 


SEQ ID NO: 2000 


tcttttcttttcagcccag 


9218 


9237 


1 


3 


SEQ ID NO: 998 


tcatgggcgacctaagttg 


3427 


3446 


SEQ ID NO: 2001 


caactgcagacatatatga 


6627 


6646 


1 


3 


SEQ ID NO: 999 


tgggcgacctaagttgtga 


3430 


3449 


SEQ ID NO: 2002 


tcactccattaacctccca 


6308 


6327 


1 


3 


SEQ ID NO: 1000 


agttgtgacacaaaggaag 


3441 


3460 


SEQ ID NO: 2003 


cttcttttccaattgaact 


13830 


13849 1 


3 


SEQ ID NO: 1001 


tgacacaaaggaagaaaga 3446 


3465 


SEQ ID NO: 2004 


tcttcatcttcatctgtca 


10212 10231 


1 


3 


SEQ ID NO: 1002 


gacacaaaggaagaaagaa 3447 


3466 


SEQ ID NO: 2005 


ttcttcatcttcatctgtc 


10211 


10230 1 


3 


SEQ ID NO: 1003 


ggaagaaagaaaaatcaag 3455 


3474 


SEQ ID NO: 2006 


cttgtcatgcctacgttcc 


11340 11359 1 


3 



SEQ ID NO: 2007 9®2i9agcgatggccgggtc 3947 3966 SEQ ID NO: 2313 gaccttgcaagaatatttt 6335 6354 1 3 

SEQ ID NO: 2008 gtcaaatataccttgaaca 3963 3982 SEQ ID NO: 2314 tgttaacaaattccttgac 7355 7374 1 3 

SEQ ID NO: 2009 tgaacaagaacagtttgaa 3976 3995 SEQ ID NO: 2315 ttcaagttcctgaccttca 8302 8321 1 3 

SEQ ID NO: 2010 agtttgaaaattgagattc 3987 4006 SEQ ID NO: 2316 gaatctggctccctcaact 9039 9058 1 3 

SEQ ID NO: 2011 gtttgaaaattgagattcc 3988 4007 SEQ ID NO: 2317 ggaaataccaagtcaaaac 10446 10466 1 3 

SEQ ID NO: 2012 ttgaaaattgagattcctt 3990 4009 SEQ ID NO: 2318 aaggaaaagcgcacctcaa 12022 12041 1 3 

SEQ ID NO: 2013 ctaaagatgttagagactg 4038 4057 SEQ ID NO: 2319 cagttgac3cacaagcttag 10537 10556 1 3 

SEQ ID NO: 2014 atgttagagactgttagga 4044 4063 SEQ ID NO: 2320 tccttaacaccttccacat 8065 8084 1 3 

SEQ ID NO: 2015 cagccctccacttcaagtc 4066 4085 SEQ ID NO: 2321 gacttctctagtcaggctg 8805 8824 1 3 

SEQ ID NO: 2016 agccctccacttcaagtct 4067 4086 SEQ ID NO: 2322 agacatcgctgggctggct 5720 5739 1 3 

SEQ ID NO: 2017 ccatctgccatctagagag 4094 4113 SEQ ID NO: 2323 ctctcaaatgacatgatgg 5322 5341 1 3 

SEQ ID NO: 2018 attcccaagtfgtatcaac 4134 4153 SEQ ID NO: 2324 gttgagaagccccaagaat 6246 6265 1 3 

SEQ ID NO: 2019 tcaactgcaagtgcctctc 4148 4167 SEQ ID NO: 2325 gagatcaagacactgttga 8835 8854 1 3 

SEQ ID NO: 2020 ggtgttctagacctctcca 4170 4189 SEQ ID NO: 2326 tggaaccctctccctcacc 4727 4746 1 3 

SEQ ID NO: 2021 ctccacgaatgtctacagc 4184 4203 SEQ ID NO: 2327 gctggtaacctaaaaggag 5580 5599 1 3 

SEQ ID NO: 2022 cacgaatgtctacagcaac 4187 4206 SEQ ID NO: 2328 gttgcccaccatcatcgtg 11663 11682 1 3 

SEQ ID NO: 2023 acgaatgtctacagcaact 4188 4207 SEQ ID NO: 2329 agttgcccaccatcatcgt 11662 11681 1 3 

SEQ ID NO: 2024 tcctacagtggtggcaaca 4224 4243 SEQ ID NO: 2330 tgttagttgctcttaagga 13351 13370 1 3 

SEQ ID NO: 2025 cgttaccacatgaaggctg 4272 4291 SEQ ID NO: 2331 cagcaagtacctgagaacg 8603 8622 1 3 

SEQ ID NO: 2026 gaaggctgactctgtggtt 4283 4302 SEQ ID NO: 2332 aacctatgccttaatcttc 13161 13180 1 3 

SEQ ID NO: 2027 tgtggttgacctgctttcc 4295 4314 SEQ ID NO: 2333 ggaaagttaaaacaacaca 6957 6976 1 3 

SEQ ID NO: 2028 cctgctttcctacaatgtg 4304 4323 SEQ ID NO: 2334 cacaccttgacattgcagg 11080 11099 1 3 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



2029 ctgctttcctacaatgtgc 4305
2030 tcctacaatgtgcaaggat 4311
2031 tatgaccacaagaatacgt 4344
2032 atgaccacaagaatacgtc 4345
2033 gaatacgtctacactatca 4355
2034 tttctagattcgaatatca 4398
2035 gattcgaatatcaaattca 4404
2036 gaaacaacccagtctcaaa 4441
2037 cccagtctcaaaaggttta 4448
2038 ctcaaaaggtttactaata 4454
2039 tcaaaaggtttactaatat 4455
2040 aaaaggtttactaatattc 4457
2041 gaaacagcatttgtttgtc 4535
2042 atttgtttgtcaaagaagt 4543
2043 tcaagattgatgggcagtt 4561
2044 ttcagagtctcttcgttct 4578
2045 cagagtctcttcgttctat 4580
2046 atgctaaaggcacatatgg 4597
2047 gcacatatggcctgtcttg 4606
2048 gagtccaacctgaggttta 4659
2049 agtccaacctgaggtttaa 4660
2050 cctacctccaaggcaccaa 4684
2051 gaagatggaaccctctccc 4722
2052 tgatctgcaaagtggcatc 4754
2053 gatctgcaaagtggcatca 4755
2054 gcttccctaaagtatgaga 4785
2055 gtatgagaactacgagctg 4796
2056 tctaacaagatggatatga 4860
2057 ctgctgcgttctgaatatc 4899
2058 tcattgaggttcttcagcc 4932
2059 ttctggatcactaaattcc 4955
2060 ccatggtcttgagttaaat 4973
2061 tcttaggcactgacaaaat 4999
2062 acaaggcgacactaaggat 5032
2063 tgcaacgaccaacttgaag 5075
2064 caacttgaagtgtagtctc 5084
2065 gctggagaatgagctgaat 5108
2066 gcagagcttggcctctctg 5127
2067 tctctggggcatctatgaa 5140
2068 tctggggcatctatgaaat 5142
2069 aacacaatgcaaaattcag 5185
2070 ctcacagagctatcactgg 5223
2071 tgggaagtgcttatcaggc 5239
2072 ttcaaggtcagtcaagaag 5295
2073 aatgacatgatgggctcat 5328
2074 gctcatatgctgaaatgaa 5341
2075 atatgctgaaatgaaatU 5345
2076 tctgaacattgcaggctta 5378
2077 gaacattgcaggcttatca 5381
2078 tgcaggcttatcactggac 5387



4324 SEQ ID NO
4330 SEQ ID NO
4363 SEQ ID NO
4364 SEQ ID NO
4374 SEQ ID NO
4417 SEQ ID NO
4423 SEQ ID NO
4460 SEQ ID NO
4467 SEQ ID NO
4473 SEQ ID NO
4474 SEQ ID NO
4476 SEQ ID NO
4554 SEQ ID NO
4562 SEQ ID NO
4580 SEQ ID NO
4597 SEQ ID NO
4599 SEQ ID NO
4616 SEQ ID NO
4625 SEQ ID NO
4678 SEQ ID NO
4679 SEQ ID NO
4703 SEQ ID NO
4741 SEQ ID NO
4773 SEQ ID NO
4774 SEQ ID NO
4804 SEQ ID NO
4815 SEQ ID NO
4879 SEQ ID NO
4918 SEQ ID NO
4951 SEQ ID NO
4974 SEQ ID NO
4992 SEQ ID NO
5018 SEQ ID NO
5051 SEQ ID NO
5094 SEQ ID NO
5103 SEQ ID NO
5127 SEQ ID NO
5146 SEQ ID NO
5159 SEQ ID NO
5161 SEQ ID NO
5204 SEQ ID NO
5242 SEQ ID NO
5258 SEQ ID NO
5314 SEQ ID NO
5347 SEQ ID NO
5360 SEQ ID NO
5364 SEQ ID NO
5397 SEQ ID NO
5400 SEQ ID NO
5406 SEQ ID NO



2335 gcacaccttgacattgcag 

2336 atccgctggctctgaagga 

2337 acgtccgtgtgccttcata 

2338gacgtccgtgtgccttcat 

2339tgattatctgaattcattc 

2340tgatttacatgatttgaaa 

2341 tgaagtagctgagaaaatc 

2342tttgaaaaattctcttttc 

2343taaattcattactcctggg 

2344tattcaaaactgagttgag 

2345 atattcaaaactgagttga 

2346gaatttgaaagttcgtttt 

2347gacagcatcttcgtgtttc 

2348 acttaaaaaatataaaaat 

2349 aactctcaagtcaagttga 

2350 agaagatggcaaatttgaa

2351 atagcatggacttcttctg 

2352 ccatttgagatcacggcat 

2353caagttggcaagtaagtgc 

2354taaagtgccacttttactc 

2355 ttaacagggaagatagact

2356ttggcaagtaagtgctagg 

2357 gggaagaagaggcagcttc 

2358 gatgaggaaactcagatca 

2359tgatgaggaaactcagatc 

2360tctcgtgtctaggaaaagc 

2361 cagcttaagagacacatac 

2362tcattttccaactaataga 

2363 gatacaagaaaaactgcag 

2364ggctcatatgctgaaatga 

2365ggaaggacaaggcccagaa 

2366 atttttattcctgccatgg 

2367attttttgcaagttaaaga 

2368atccatgatctacatttgt 

2369 cttcagggaacacaatgca 

2370gagatgagagatgccgttg 

2371 attctcttttcttttcagc 

2372 cagatacaagaaaaactgc 

2373ttcattcaattgggagaga 

2374atttgtaagaaaatacaga 

2375 ctgaagcattaaaactgtt 

2376ccagatgctgaacagtgag 

2377 gcctacgttccatgtccca 

2378 cttcagtgcagaatatgaa 

2379 atgattatctgaattcatt 

2380ttcagccattgacatgagc 

2381 aaatagctattgctaatat 

2382taagaaccagaagatcaga 

2383tgatatcgacgtgaggttc 

2384gtcctggattccacatgca 



1107911098 1 3 

8569 8588 1 3 

9976 9995 1 3 

9975 9994 1 3 

6479 6498 1 3 

6677 6696 1 3 

7094 7113 1 3 

9206 9225 1 3 

1129411313 1 3 

1222312242 1 3 

1222212241 1 3 

9272 9291 1 3 

1120611225 1 3 

8014 8033 1 3 

1341413433 1 3 

1198712006 1 3 

8865 8884 1 3 

9237 9256 1 3 

9364 9383 1 3 

6182 6201 1 3 

9300 9319 1 3 

9368 9387 1 3 

1228312302 1 3 

1225512274 1 3 

1225412273 1 3 

5969 5988 1 3 

6912 6931 1 3 

1302413043 1 3 

6893 6912 1 3 

5340 5359 1 3 

1254112560 1 3 

1009610114 1 3 

1401114030 1 3 

6786 6805 1 3 

5177 5196 1 3 

6231 6250 1 3 

9214 9233 1 3 

6891 6910 1 3 

6491 6510 1 3 

6428 6447 1 3 

7498 7517 1 3 

8141 8160 1 3 

1134811367 1 3 

1196911988 1 3 

6478 6497 1 3 

5738 5757 1 3 

6694 6713 1 3 

1098811007 1 3 

1248212501 1 3 

1184411863 1 3 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



2079 tcaaaacttgacaacattt 5412
2080 atttacagctctgacaagt 5427
2081 ctctgacaagttttataag 5435
2082 gttaatttacagctacagc 5460
2083 ttctctggtaactactttB 5483
2084 cctaaaaggagcctaccaa 5588
2085 aaaaggagcctaccaaaat 5591
2086 aggagcctaccaaaataat 5594
2087 ataatgaaataaaacacat 5608
2088 aaaacacatctatgccatc 5618
2089 tgctaaggttcagggtgtg 5678
2090 gagtttagccatcggctca 5697
2091 gctggcttcagccattgac 5732
2092 atttcagcaatgtcttccg 5782
2093 tttcagcaatgtcttccgt 5783
2094 ttcagcaatgtcttccgtt 5784
2095 cagcaatgtcttccgttct 5786
2096 tgtcttccgttctgtaatg 5792
2097 gtcttccgttctgtaatgg 5793
2098 atgggaaactcgctctctg 5851
2099 ggagaacatactgggcagc 5871
2100 gttgaaagcagaacctctg 5906
2101 gtctaggaaaagcatcagt 5975
2102 agcatcagtgGagctcttg 5985
2103 ttgaacacaaagtcagtgc 6001
2104 gcagacaggcacctggaaa 6038
2105 gaaactcaagacccaattt 6053
2106 acaatgaatacagccagga 6076
2107 cttggatgcttacaacact 6095
2108 ttggcgtggagcttactgg 6124
2109 cacttttactcagtgagcc 6190
2110 tltagagatgagagatgcc 6227
2111 gagaagccccaagaattta 6249
2112 caattgttgcttttgtaaa 6268
2113 ttttgtaaagtatgataaa 6278
2114 ttgtaaagtatgataaaaa 6280
2115 ttcactccattaacctocc 6307
2116 ttttgagaccttgcaagaa 6329
2117 accttgcaagaatattltg 6336
2118 tcaatattgatcaatttgt 6415
2119 cagagcagccctgggaaaa 6443
2120 cctgggaaaactcccacag 6452
2121 actcccacagcaagctaat 6461
2122 aattcattcaattgggaga 6489
2123 ttcaattgggagagacaag 6495
2124 aggagaaactgactgctct 6526
2125 actgactgctctcacaaaa 6533
2126 gactgctctcacaaaaaag 6536
2127 cagacatatatgatacaat 6633
2128 aatttgatcagtatattaa 6649



5431 SEQ
5446 SEQ
5454 SEQ
5479 SEQ
5502 SEQ
5607 SEQ
5610 SEQ
5613 SEQ
5627 SEQ
5637 SEQ
5697 SEQ
5716 SEQ
5751 SEQ
5801 SEQ
5802 SEQ
5803 SEQ
5805 SEQ
5811 SEQ
5812 SEQ
5870 SEQ
5890 SEQ
5925 SEQ
5994 SEQ
6004 SEQ
6020 SEQ
6057 SEQ
6072 SEQ
6095 SEQ
6114 SEQ
6143 SEQ
6209 SEQ
6246 SEQ
6268 SEQ
6287 SEQ
6297 SEQ
6299 SEQ
6326 SEQ
6348 SEQ
6355 SEQ
6434 SEQ
6462 SEQ
6471 SEQ
6480 SEQ
6508 SEQ
6514 SEQ
6545 SEQ
6552 SEQ
6555 SEQ
6652 SEQ
6668 SEQ



ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 



2385aaattccttgacatgttga 

2386acttaaaaaatataaaaat 

2387 cttacttgaattccaagag 

2388 gctgcatgtggctggtaac 

2389taaaagattactttgagaa 

2390ttggcaagtaagtgctagg 

2391atttacaattgttgctttt 

2392altacctatgalttctcct 

2393 atgtcaaacactttgttat 

2394 gatgaagatgacgactttt 

2395cacaagtGgattcccagca 

2396tgaggtgactcagagactc 

2397 gtcagigaagttctccagc 

2398cggagcatgggagtgaaat 

2399acggagcatgggagtgaaa 

2400aacggagcatgggagtgaa 

2401 agaagtgtcttcaaagctg 

2402 cattcaattgggagagaca

2403ccaUcagtctctcaagac 

2404 cagataaaaaactcaccat 

2405gctgttttgaagactctcc 

2406 cagaattcataatcccaac 

2407 actgcaagatttttcagac 

2408caagaacctgttagttgct 

2409gcacatcaatattgatcaa 

241 Otttcagatggcattgctgc 

241 1 aaatcccatccaggttttc 

2412tcctttggctgtgclttgt 

24 1 3 agtgaagttctocagcaag 

2414ccagaattcataatcccaa 

2415ggctattgatgttagagtg 

24 1 6 ggcatgatgctcatttaaa 

24 1 7 taaagccattcagtctctc 

24 1 8 tttaaccagtcagatattg 

24 1 9 tttattgctgaatccaaaa 
2420ttttgagaggaatcgacaa 
2421 gggaaaaaacaggcttgaa 
2422ttctctctatgggaaaaaa 
2423caaaagaagcccaagaggt 
2424 acaaagcagattatgttga 
2425 itttcagaccaactctctg 
2426ctgtctctggtcagccagg 
2427attacacttcctttogagt 
2428tctcttcotccatggaatt 
2429 cttggagtgccagtttgaa 
2430agagctiatgggatttcct 
2431 ttttggcaagctatacagt 
2432 ctttgtgagtttatcagtc 
2433attggatatccaagatctg 
2434ttaaaagaaatcttcaatt 



7362 7381 
8014 8033 
1066610685 
5570 5589 
7267 7286 
9368 9387 
6263 6282 
1011910138 
7057 7076 
1215012169 
9079 9098
7442 7461 
8588 8607 
8620 8639 
8619 8638 
8618 8637 
1240412423 
6493 6512 
1296712986 
1220512224 
1080 1099 
8266 8285 
1360413623 
1334313362 
6410 6429 
1160211621 
8029 8048 
9674 9693
8591 8610 
8265 8284 
6980 6999 
9169 9188 
1296212981 
1017910198 
1364713666 
6350 6369 
9568 9587 
9558 9577 
1294012959 
1182111840 
1361413633 
7716 7735 
1286112880 
1047110490 
1180011819 
1115511174 
8372 8391 
9687 9706 
1925 1944 
1380713826 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



2129 tatgatttacatgatttga 6675
2130 tttgaaaatagctattgct 6689
2131 ttgaaaatagctattgcta 6690
2132 aatagctattgctaatatt 6695
2133 attattgatgaaatcattg 6711
2134 aaagtcttgatgagcacta 6739
2135 aagtcttgatgagcactat 6740
2136 ttgatgagcactateataf 6745
2137 taattttagtaaaaacaat 6769
2138 ttttagtaaaaacaatcca 6772
2139 acatttgtttattgaaaat 6797
2140 attgattttaacaaaagtg 6816
2141 attttaacaaaagtggaag 6820
2142 aaatcagaatccagataca 6880
2143 gaatccagatacaagaaaa 6886
2144 ttaagagacacatacagaa 6916
2145 atccagcacctagctggaa 6942
2146 tgagcatgtcaaacacttt 7052
2147 gagcatgtcaaacactttg 7053
2148 aaacactttgttataaatc 7062
2149 tgagaaaatcaatgcx:ttc 7103
2150 tatgaagtagaccaacaaa 7152
2151 aagtagaccaacaaatcca 7156
2152 aagttgaaggagactattc 7215
2153 acaagttaagataaaagat 7256
2154 aagataaaagattactttg 7263
2155 gattactttgagaaattag 7272
2156 tgagaaattagttggattt 7280
2157 aaattagttggatttattg 7284
2158 tggatttattgatgatgct 7292
2159 tcattgaagatgttaacaa 7345
2160 cattgaagatgttaacaaa 7346
2161 attgaagatgttaacaaat 7347
2162 ttgaagatgttaacaaatt 7348
2163 tgaagatgttaacaaattc 7349
2164 acatgttgataaagaaatt 7372
2165 tUgattaccaccagtttg 7398
2166 caaaatccgtgaggtgact 7433
2167 aaaatccgtgaggtgactc 7434
2168 aggtgactcagagactcda 7444
2169 gtgaaattcaggctctgga 7465
2170 gttgcagtgtatctggaaa 7539
2171 ttaagttcagcatctttgg 7608
2172 tgaaggccaaaUccgaga 7633
2173 aatgtatcaaatggacatt 7676
2174 attcagcaggaacttcaac 7692
2175 acctgtctctggtcagcca 7714
2176 cctgtctctggtcagccag 7715
2177 ggtcagccaggtttatagc 7724
2178 ccaggtttatagcacactt 7730



6694 SEQ
6708 SEQ
6709 SEQ
6714 SEQ
6730 SEQ
6758 SEQ
6759 SEQ
6764 SEQ
6788 SEQ
6791 SEQ
6816 SEQ
6835 SEQ
6839 SEQ
6899 SEQ
6905 SEQ
6935 SEQ
6981 SEQ
7071 SEQ
7072 SEQ
7081 SEQ
7122 SEQ
7171 SEQ
7175 SEQ
7234 SEQ
7275 SEQ
7282 SEQ
7291 SEQ
7299 SEQ
7303 SEQ
7311 SEQ
7364 SEQ
7365 SEQ
7366 SEQ
7367 SEQ
7368 SEQ
7391 SEQ
7417 SEQ
7452 SEQ
7453 SEQ
7463 SEQ
7484 SEQ
7558 SEQ
7627 SEQ
7652 SEQ
7695 SEQ
7711 SEQ
7733 SEQ
7734 SEQ
7743 SEQ
7749 SEQ



ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 



2435tcaatgattatatcccata 

2436 agcacagaaaaaattcaaa 

2437tagcacagaaaaaattcaa 

2438 aataaatggagtctttatt 

2439caataccagaattcataat 

2440tagtgattacacttccttt 

2441 atagcaacactaaatactt 

2442atatccaagatgagatcaa 

2443 attgagattccctccatta 

2444tggagtgccagtttgaaaa 

2445 atttcctaaagctggatgt 

2446cactgttccagttgtcaat 

2447cttcaaagacttaaaaaat 

2448tgtaccataagccatattt 

2449 ttttctaaacttgaaattc 

24 5 0 ttcttaaa ca tt c ctttaa 

2451 ttccaatttccctgtggat 
2452aaagtgccacttttactca 
2453caaatgacatgatgggctc 
2454gattatateccatatgttt 
2455 gaaggeaaagcgcacctca 
2456tttgtggagggtagtcata 
2457tggatgaagatgacgactt 
2458gaataccaatgctgaactt 
2459atctaaattcagttcttgt 
2460caaaatagaagggaatctt 
2461 ctaaacttgaaattcaatc 
2462aaatccgtgaggtgactca 
2463caamtgagaatgaattt 
2464 agcatgcctagtttctcca 
2465ttgtagatgaaaccaatga 
2466tttgtagatgaaaccaatg 
2467atttaagtatgatttcaat 
2468 aatttaagtatgatttcaa 
2469gaatttaagtatgatttca 
2470aattccctgaagttgatgt 
2471 caaattgaacatecccaaa 
2472agtccccctaacagatttg 
2473 gagtgaaatgctgtttttt 
24741tgatgatatctggaaccl 
2475tccaatctcctcttttcac 
2476 tttcaagcaaatgcacaac 
2477 Gcaatgctgaactttttaa 
2478tctcctttcttcatcttca 
2479 aatgaagtccggattcatt 
2480gttgagaagccccaagaat 

2481 tggcaaglaagtgctaggt 

2482 ctggacttctcf agtcagg 

2483 gctaaaggagcagttgacc 
2484 aagtccggattcattctgg 



1312013139 1 3 

1385613875 1 3 

1385513874 1 3 

1407614095 1 3 

8260 8279 1 3

1285612875 1 3 

8761 8780 1 3 

1309313112 1 3 

1169411713 1 3 

1180211821 1 3 

1116711186 1 3 

9863 9882 1 3 

8006 8025 1 3 

1008010099 1 3 

9057 9076 1 3

9483 9502 1 3 

3680 3699 1 3 

6183 6202 1 3 

5326 5345 1 3 

1312613144 1 3 

1202112040 1 3 

1032310342 1 3 

1214812167 1 3 

1016010179 1 3 

1132611345 1 3 

2069 2088 1 3 

9061 9080 1 3

7435 7454 1 3 

1041110430 1 3 

9945 9964 1 3 

7414 7433 1 3 

7413 7432 1 3 

1048710506 1 3 

1048610505 1 3 

1048510504 1 3 

1147911498 1 3 

8783 8602 1 3 

7964 7983 1 3 

8630 8649 1 3 

1072310742 1 3 

8401 8420 1 3 

8532 8551 1 3 

1016510184 1 3 

1020510224 1 3 

1101311032 1 3 

6246 6265 1 3 

9369 9388 1 3 

8802 8821 1 3 

1052710546 1 3 

1101711036 1 3 
SEQ ID NO
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 



2179 gtttatagcacacttgtca 7734
2180 acttgtcacctacatttct 7745
2181 ctgattggtggactcttgc 7762
2182 atgaaagcattggtagagc 7839
2183 tgaaagcattggtagagca 7840
2184 gggttcactgttcctgaaa 7860
2185 tcaagaccatccttgggac 7879
2186 ccttgggaccatgcctgcc 7889
2187 ttcaggctcttcagaaagc 7921
2188 ttcagataaacttcaaaga 7996
2189 acttcaaagacttaaaaaa 8005
2190 atcccatccaggttttcca 8031
2191 gaatttaccatccltaaca 8055
2192 cattccttcctttacaatt 8081
2193 ttgaccagatgctgaacag 8137
2194 aatcaccctgccagacttc 8225
2195 tgaccltcacataccagaa 8312
2196 ttccagcttccccacatct 8331
2197 aagctatacagtattctga 8379
2198 attctgaaaatccaatctc 8391
2199 tttcacattagatgcaaat 8414
2200 caaatgctgacatagggaa 8428
2201 gagagtccaaattagaagt 8500
2202 agagtccaaattagaagtt 8501
2203 tctcaattttgattttcaa 8519
2204 caattttgattttcaagca 8522
2205 aatgcacaactctcaaacc 8541
2206 agttctccagcaagtacct 8596
2207 agtacctgagaacggagca 8608
2208 tcaaacacagtggcaagti 8670
2209 acaatcagcttaccctgga 8743
2210 ctggatagcaacactaaat 8757
2211 ctgacctgcgcaacgagat 8821
2212 agatgagggaacacatgaa 8921
2213 tcaacttttctaaacttga 9052
2214 ttctaaacttgaaattcaa 9059
2215 gaaattcaatcacaagtcg 9069
2216 cactgtttggagaagggaa 9133
2217 actgtttggagaagggaag 9134
2218 aattctcttttcttttcag 9213
2219 ttcttttcagcccagccat 9222
2220 tttgaaagttcgttttcca 9275
2221 cagggaagatagacttcct 9304
2222 ataagtacaaccaaaattt 9397
2223 acaacgagaacattatgga 9427
2224 aggaataaatggagaagca 9455
2225 agcaaatctggatttctta 9470
2226 tcctttaacaattcctgaa 9494
2227 tttaacaattcctgaaatg 9497
2228 acacaataatcacaacicc 9526



7753 SEQ
7764 SEQ
7781 SEQ
7858 SEQ
7859 SEQ
7879 SEQ
7898 SEQ
7908 SEQ
7940 SEQ
8015 SEQ
8024 SEQ
8050 SEQ
8074 SEQ
8100 SEQ
8156 SEQ
8244 SEQ
8331 SEQ
8350 SEQ
8398 SEQ
8410 SEQ
8433 SEQ
8447 SEQ
8519 SEQ
8520 SEQ
8538 SEQ
8541 SEQ
8560 SEQ
8615 SEQ
8627 SEQ
8689 SEQ
8762 SEQ
8776 SEQ
8840 SEQ
8940 SEQ
9071 SEQ
9078 SEQ
9088 SEQ
9152 SEQ
9153 SEQ
9232 SEQ
9241 SEQ
9294 SEQ
9323 SEQ
9416 SEQ
9446 SEQ
9474 SEQ
9489 SEQ
9513 SEQ
9516 SEQ
9545 SEQ



ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO
ID NO
ID NO
ID NO
ID NO
ID NO



2485tgacctgtccattcaaaac 

2486 agaaaaaggggattgaagt 

2487 gcaagttaaagaaaatcag 
2488gctcatctcctttcttcat 
2489tgctcatctcctttcttoa 
2490tttcaccatagaaggaccc 
2491 gtccccctaacagatttga 
2492ggcac%agggctcggaagg 
2493 gctfgaaggaattcttgaa 
2494tcttcataagttcaaigaa 
2495tttfaacaaaagtggaagt 
2496tggagaagcaaatctggat 
2497tgttgaagtgtctccattc 

2498 aattccaattttgagaatg 

2499 ctgttgaaagatttatcaa 
2500gaagttctcaattttgatt 
2501 ttcttctggaaaagggtca 
2502agattctcagatgagggaa 
2503tcagatggcattgctgctt 
2504gagataaccgtgcctgaat 
2505 attttgaaaaaaacagaaa 
2506ttccatcacaaatcctttg 
2507actttacttcccaactctc 
2508aactttacttcccaactct 
2509ttgattcccttttttgaga 
251 Otgctgaatccaaaagattg 
251 1 ggtttatcaaggggccatt 
251 2aggttccatcgtgcaaact 
251 3tgctccaggagaacttact 
2514aactctcaagtcaagttga 
251 Stccattctgaatatattgt 
25 1 6 attttctgaacttccccag 
251 7atctgatgaggaaactcag 
25 1 8 ttcatgtccctagaaatct 
251 9tcaaggataacgtgtttga 
2520ttgatgatgctgtcaagaa 
2521 cgacgaagaaaataatttc 
2522ttccagaaagcagccagtg 
2523 cttccccaaagagaccagt 
2524 ctgattactatgaaaaatt 
2525atggaaaagggaaagagaa 
2526tggaagtgtcagtggcaaa 
2527aggacctttcaaattcctg 
2528aaatcaggatctgagttat 
2529tccattctgaatatattgt 
2530tgctggaattgtcattcct 
2531 taagttctctgtacctgct 
2532ttcaaaacgagcttcagga 
2533catttgatttaagtgtaaa 
2534 ggagacagcatcttcgtgt 



1367313692 
1027510294 
1401814037 
1020010219 
1019910218 
8951 8970 
7965 7984 
1397013989 
9580 9599 
1317513194 
6821 6840 
9464 9483
9881 9900 
1040810425 
1292412943 
8514 8533 
8876 8895 
8913 8932 
1160411623 
1154411563 
9730 9749 
9662 9681 
1340213421 
1340113420 
1152911548 
1365213671 
1245212471 
1138011399 
1377213791 
1341413433 
1337213391 
1269412713 
1225112270 
1003010049 
1261012629 
7300 7319 
1355813577 
1249812517 
2890 2909 
1363013649 
1348613505 
1037210391 
9840 9859 
1403014049 
1337213391 
1172611745 
1171111730 
1319813217 
9613 9632 
1120311222 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



2229aagatttctctctatggga 

2230gaaaaaacaggcttgaagg 

2231 ttgaaggaattcttgaaaa 

2232tgaaggaattcttgaaaac 

2233 agctcagtataagaaaaac 

2234tcaaatcctttgacaggca 

2235 atgaaacaaaaattaagtt 

2236aattcctggatacactgtt 

2237tlccagttgtcaatgttga 

2236aagtgtctccattcaccat 

2239gtcagcatgcctagtttct 

2240ctgccatgggcaatattac 

2241 tgaataccaatgctgaact 

2242tattgttgctcatctectt 

2243tgltgctcatctccttlct 

2244tctgtcattgatgcactgc 

2245ccacagctctgtctcfgag 

2246atttgtggagggtagtcat 

2247atatggaagtgtcagtggc 

224dtsgaaataccaagtcaaaa 

2249aagtcaaaacctactgtct 

2250actgtctcttcctccalgg 

2251 cttcctccatggaatttaa 

2252attcttcaatgctgtactc 

2253ttgaccacaagcttagctt 

2254cctcacctcttacttttcc 

2255agctgcagggcacttccaa 

2256ttccadaattgatgatatc 

2257gagaacataca3gcaaagc 

2258 atggcaaaf gtcagctctt 

2259 tggcaaatgtcagctcttg 

2260ttgttcaggtccatgcaag 

2261 tgttcaggtccatgcaagt 

2262 agttccttccatgatttcc 
2263 tgctaacactaagaaccag 
2264 actaagaaccagaagatca 
2265ctaagaaccagaagatcag 
2266 cagaagatcagatggaaaa 
2267aaaaatgaagtccggattc 
2268 gattcattctgggtctttc 
2269 aagaaaaggcacaccttga 
2270 aaggacacclaaggttcct 
2271 ccagcattggtaggagaca 
2272ctttgtgtacaccaaaaac 
2273ccatccctgtaaaagtttt 
2274tgatctaaattcagttctt 
2275 aagaagctgagaacttcat 
2276 tttgccctcaacctaccaa 
2277cttgattcccmtttgag 
2278 ttcacgcttccaaaaagtg 



9553 9572 SEQ ID NO:
9570 9589 SEQ ID NO:
9582 9601 SEQ ID NO:
9583 9602 SEQ ID NO:
9632 9651 SEQ ID NO:
9712 9731 SEQ ID NO:
9781 9800 SEQ ID NO:
9851 9870 SEQ ID NO:
9868 9887 SEQ ID NO:
9886 9905 SEQ ID NO:
9942 9961 SEQ ID NO:
10105 10124 SEQ ID NO:
10159 10178 SEQ ID NO:
10193 10212 SEQ ID NO:
10196 10215 SEQ ID NO:
10224 10243 SEQ ID NO:
10297 10316 SEQ ID NO:
10322 10341 SEQ ID NO:
10369 10388 SEQ ID NO:
10445 10464 SEQ ID NO:
10455 10474 SEQ ID NO:
10467 10486 SEQ ID NO:
10474 10493 SEQ ID NO:
10504 10523 SEQ ID NO:
10540 10559 SEQ ID NO:
10565 10584 SEQ ID NO:
10702 10721 SEQ ID NO:
10715 10734 SEQ ID NO:
10852 10871 SEQ ID NO:
10889 10908 SEQ ID NO:
10890 10909 SEQ ID NO:
10906 10925 SEQ ID NO:
10907 10926 SEQ ID NO:
10932 10951 SEQ ID NO:
10979 10998 SEQ ID NO:
10986 11005 SEQ ID NO:
10987 11006 SEQ ID NO:
10995 11014 SEQ ID NO:
11010 11029 SEQ ID NO:
11024 11043 SEQ ID NO:
11071 11090 SEQ ID NO:
11107 11126 SEQ ID NO:
11191 11210 SEQ ID NO:
11231 11250 SEQ ID NO:
11269 11288 SEQ ID NO:
11324 11343 SEQ ID NO:
11424 11443 SEQ ID NO:
11445 11464 SEQ ID NO:
11528 11547 SEQ ID NO:
11583 11602 SEQ ID NO:
Z090 lyy occigwaccaaagcig


1 1Q*59 1 1Q7 1 
1 09OZ 1 09 r 1 


1 

1 


1 


SEQ 


ID 


NO: 


22oo wiccaicigcgciaccaya 


i^nftft 




ID 


NO; 


Z99^ ic(y ciiaiacsivacyyoy 


1*^7011 '^799 
1 Of MO 10 f ZZ 


1 
1 


3 


SEQ 


ID 


NO: 


22o9aigag9a3acTcagaicaa 




l^rOSEQ 


ID 


NO: 


Zu90 iig agiigcccacca icai 


1 iR<^Q1 iR7R 


1 
1 


•a 
0 


SEQ 


ID 


NO: 


2290aggcagcuCi99Cugci 


•IOOQ9 


i^^nSEQ 


ID 


NO: 


zDoo ag caag i(/iucciggcc( 


OUlU OUZ9 


1 
1 


0 
0 


SEQ 


ID 


NO: 


2291 igaaagacaacytydjcaa 




l/OOOStQ 


ID 


NO 


ORQ7f frmnQnQn4^99n4tff^9 

Z09/ uyggayagacaayinca 


OwUU V9 l9 


1 
1 


3 


SEQ 


ID 


NO: 


229^taigauaigicaacaayi 




izoroSEQ 


ID 


NO 


Z990 BCluy CaClaignCalQ 


1 97c; *^ 19774 
IZ/ \39 lZ» 1 *T 


1 
1 


3 


SEQ 


ID 


NO 


2293canaggcaaaiigaigai 






ID 


NO 


ORQQ of^.49^9/«4^f/^^/%9o4n 

Z999 aicaacacaaicucaaig 


inri7iii9R 

1 0 1 U r 10 1 ZO 


1 


1 
0 


SEQ 


ID 


NO: 


2294^8Cicaggaaggccaag 


iZD/ D 




ID 


NO. 


ORnO r*ff/inf a/«nanfta/*4/*ao 

zouu cuy yiacgaguaCiCaa 


19R19 19ftm 


1 
1 


1 

0 


SEQ 


ID 


NO: 


2295y°°Q^tQ99^^°^^^°^^ 




1Z/47SEQ 


1 n 

ID 


NO: 


ZDU 1 agigauacaciicciuc 


1 9RR7 1 9R7R 


1 


•I 


SEQ 


ID 


NO 


A A AO^^nM^^A »^ j'^il »A A /*i#^AA A 

229Q^ccinc9 agiia a g gaaa 


1^009 


l^ooooEQ 


ID 


NO; 


zDuz ulCigcxTavigcicagga 


I'^CklR I^^IR 


1 


•1 


SEQ 


ID 


NO 


2297 gccaucagicicicaaga 


I4C9DD 


TzyoostQ 


1 r% 

ID 


NO 


zouo ccnccy ucig laaiggc 


(^7QA RR1 ^ 
Dr9*r OOlO 


1 


0 


SEQ 


iD 


NO 


229DgtgciucgiaQicuca9g 


1^990 


1 OUT fcOcQ 


ID 


NO 


00^^ AA4a A A AA A A A A ATAA/*I AA 

zowrt ccigcaccaaag viggcac 


HQRR11Q7R 


•1 
1 


1 


SEQ 


ID 


NO 


2299agcxgaaagagaiQaaaii 


loUDf 


1307bSEQ 


ID 


NO 


ZDUO aaiuaiicaaaacgagci 


1 1109 1 191 1 
1 n3 1 9Z 1 OZ 1 1 


1 


1 


SEQ 


ID 


NO 


2300 as tnacuaicuaiiaa 




1 0U9I SEQ 


1 

ID 


NO 


OROR ff 999a/1999t/«44/*99i( 

ZDUO Uaaaag aaalCllCaaU 


1 1Rn711ft9R 


1 


0 
0 


ocU 


lU 


NO 


1 uHaaauyuyaaayaa 






ID 


NO. 


ZDU/ HClClwlaiyyyaaaaaa 


QR<>R Qfi77 
9o90 99 r r 


1 


1 

0 


SEQ 


irv 

ID 


NO 


23U2iaaiciicaiaagucaai 




i ^4Q1 


ID 


NO 


ZDUO any By auccciccaiia 


11RQA1171^ 
1 lOvt 1 1 f 10 


1 


0 


SEQ 


ID 


NO 


2303atattttgatccaagtata 


13271 


13290SEQ 


ID 


NO 


2609tataagcagaagcacatat 


1392913948 


1 


3 


SEQ 


ID 


NO 


2304tgaaatattatgaacttga 


13303 


13322SEQ 


ID 


NO 


261 0 tcaaccttaatgattttca 


8287 8306 


1 


3 


SEQ 


ID 


NO 


2305caatttctgcacagaaata 


13434 


13453 SEQ 


ID 


NO 


2611 tattcttcttttccaattg 


1382613845 


1 


3 


SEQ 


ID 


NO 


2306a9aagQttgcagagcUtc 


13501 


13520 SEQ 


ID 


NO 


; 2612gaaatcttcaatttattct 


1381313832 


1 


3 


SEQ 


ID 


NO 


2307gaagaaaataatttctgat 


13562 


13581 SEQ 


ID 


NO 


261 3 atcagttcagataaacttc 


7991 8010 


1 


3 


SEQ 


ID 


NO 


2308ttgacctgtccattcaaaa 


13672 


13691 SEQ 


ID 


NO 


2614ttttgagaatgaatttcaa 


1041410433 


1 


3 


SEQ 


ID 


NO 


2309tcaaaactaccacacattt 


13685 


13704SEQ 


ID 


NO 


261 5 aaattocttgacatgttga 


7362 7381 


1 


3 


SEQ 


ID 


NO 


231 Ottttttaaaagaaatcttc 


13803 


13822 SEQ 


ID 


NO 


261 6 gaagtgtcagtggcaaaaa 


1037410393 


1 


3 


SEQ 


ID 


NO 


231 1 aggatctgagttattttgc 


14035 


14054 SEQ 


ID 


NO 


261 7 gcaagggttcactgttcct 


7856 7875 


1 


3 


SEQ 


ID 


NO 


231 2tttgctaaactt9ggggag 


14049 


14068 SEQ 


ID 


NO 


26 1 8 ctccccaggacctttcaaa 


9834 9853 


1 


3 
Table 9. Selected palindromic sequences from human ApoB







Source


Start End 
Index Index 




Match


Start End # 
Index Index 


n 

D 




SEQ ID NO:


2619 


ggocattocagaagggaag 


617 


636 SEQ 


ID NO" 


3948 cttccgttctgtaatggcc


5803


5822 


1 


9 


SEQ ID NO


2620


tgccatctcgagagttcca 


4107 


4126SEQ 


ID NO* 


3949 tggaactctctccatggca


10884 


10903 


1 


8 


SEQ ID NO


2621 


catgtcaaacactttgtta 


7064 


7083 SEQ 


ID NO' 


3950taacaaattccttgacatg 


7366 


7385 


1 


8 


SEQ ID NO


2622


tttgttataaatcttattg 


7076 


7095SEQ 


ID NO' 


3951 caataagatcaatagcaaa 


8998 


9017 


1 


8 


SPO ID NO' 

111/ I^W< 




tctggaaaagggtcatgga 


8888 


8907 SEQ 


ID NO' 


3959tccatgtcccatttacaga 


11364 


11383 


1 


8 


SPO ID NO' 


2624 


cagctcttgttcaggtcca 


10908 10927 SEQ 


ID NO- 


3960tggacct9caccaaagctg 


13960 


13979 


1 


8 


^PO ID NO 


2B25 


ggaggttccccagctctgc 


364 


383SEQ 


ID No- 
ll./ IvWi 


3981 gcagccctgggaaaactcc 


6455 


6474 


1 


7 


*?PO ID wo 




ctgttttgaa g actctcca 


1089 


1108 SEQ 


ID NO 

IL/ 


3962tggagggtagtcataacag 


10335 


10354 


1 


7 


cpQ in NO 


2R27 


agtggctgaaacgtgtgca 


1305 


1324 SEQ 


ID NO 

IL/ liw 


3963 tgcagagctttctgccact 


13516 


13535 


1 


7 


SPO ID NO 


2628 


ccaaaatagaagggaatd 


2076 


2095SEQ 


ID NO 

IL/ ItIV 


3964 agattcctttgccttttgg 


4008 


4027 


1 


7 


ccn ID NO 


9fi9Q 


tgaagagaagattgaattt 


3620 3647 SEQ 


in NO 


3965aaattctcttttcttttca 


9220 


9239 


1 


7 


SPO ID NO< 


26j)n 


agtggtggcaacaocagca 


4238 


4267SEQ 


in NO 


3966tgctagtgaggccaacact 


10667 


10676 


1 


7 


SPO ID NO' 

OC>j| IL/ I^W 


£00 1 


aaggctccacaagteatca 


5958 


5977SEQ 


ID NO 


3967 tgatgatatctggaacctt 


10732 


10751 


1 


7 


<=;PO in NO 


9fi^2 


gtcagccaggtttatagca 


7733 


7752SEQ 


ID NO 

IL/ INL/ 


3968tgctaagaaccttactgac 


7789 


7808 


1 


7 


SPO ID NO 


^\/<j«^ 


tgatatctggaaccttgaa 


10735 10754SEQ 


ID NO 

1 L/ IM w 


396gttcactgttcctgaaatca 


7871 


7890 


1 


7 


SEQ ID NO 




gtcaagttgagcaatttct 


13431 13450 SEQ 


ID NO 


3970 agaaaaggcacaccttgac 


11080 


11099 


1 


7 


SPO ID NO 

WCW ILf IN^ 




atccagatggaaaagggaa 


13488 13507SEQ 


ID NO 


3971 ttccaatttccdgtggat 


3688 


3707 


1 


7 


ccn ID NO 

Iv I^W 


> 2636 


atttgtttgtcaaagaagt 


4551 


4570SEQ 


ID NO 

1^ lilW 


3972 acttcagagaaatacaaat 


11409 


11428 


4 


6 


SEQ ID NO 


2637 


ctggaaaatgtcagcctgg 


212 


231 SEQ 


ID NO 


3973 ccagacttccgtttaccag 


8243 


8262 


2 


6 


SEQ ID NO 


263B 


accaggaggttcttcttca 


1737 


1768SEQ 


ID NO 


3974tgaagtgtagtctcctggt 


5097 


5116 


2 


6 


SEQ ID NO 


2639 


aaagaagttctgaaagaat 


1964 


1983SEQ 


ID NO 


3975 attccatcacaaatccttt 


9669 


9688 


2 


6 


SEQ ID NO 


2640 


gctacagcttatggctcca 


3578 


3597 SEQ 


ID NO 


3976 tggatctaaatgcagtagc 


11631 


11650 


2 


6 


SEQ ID NO 


2641 


atcaatattgatcaatttg 


6422 


6441 SEQ 


ID NO 


3977 caaagaagtcaagattgat 


4561 


4580 


2 


6 


SEQ ID NO 


2642 


gaattatcttttaaaacat 


7334 


7353SEQ 


ID NO 


3978 atgtgttaacaaaatattc 


11502 


11521 


2 


6 


SEQ ID NO 


2643 


cgaggcccgcgdgctggc 


138 


167 SEQ 


ID NO 


3979gccagaagtgagatcctcg 


3515 


3534 


1 


6 


SEQ ID NO 


2644 


acaactatgaggctgagag 


279 


298SEQ 


ID NO 


3980ctctgagcaacaaatttgt 


10317 


10336 


1 


6 


SEQ ID NO 


2645 


gctgagagttccagtggag 


290 


309 SEQ 


ID NO 


3981 ctccatggcaaatgtcagc 


10893 


10912 


1 


6 


SEQ ID NO 


2648 


tgaagaaaaccaagaactc 


456 


475SEQ 


ID NO 


3982 gagtcattgaggttcttca 


4937 


4956 


1 


6 


SEQ ID NO 


2647 


cctacttacatcctgaaca 


666 


686sEQ 


ID NO 


3983tgttcalaagggaggtagg 


12774 


12793 


1 


6 


SEQ ID NO 


2646 


ctacttacatcctgaacat 


667 


586SEQ 


ID NO 


3984 atgtlcataagggaggtag 


12773 


12792 


1 


6 


SEQ ID NO 


264B 


gagacagaagaagocaagc 


623 




ID NO 

ill/ l^w 


3985gcttggttttgccagtctc 


2467 


2486 


1 


6 


SEQ ID NO 


265D 


cactcactttaccgtcaag 


679 


698 SEQ 


ID NO 


3986cttgaacacaaagtcagtg 


6008 


6027 


1 


6 


SEQ ID NO 


2S51 


ctoatcagcagcagccagt 


830 


849SEQ 


ID NO 

1 L/ I^W 


3S87actgggaagtgcttatcan 


5245 


5264 


1 


6 


SPO ID NO 

OI^W IL/ i^w 




actggacgctaagaggaag 


862 


881 SEQ 


in MO 

IL/ INW 


3988 cttcccceaagagaccagt 


2898 


2917 


1 


6 


SPO ID NO 

wE^Vm IL/ llV/ 


2663 


agaggaagcatgtggcaga 


873 


892SEQ 


ID NO 

IL/ INw 


3989tctggcatttactttctct 


5929 


5948 


1 


6 


SPO ID NO 


266d 


tgaagactctccaggaact 


1095 


1114SEQ 


ID NO 


3990 agttgaaggagactattca 


7224 


7243 


1 


6 


SPO ID NO 


26^6 
£□99 


ctctgagcaaaatatccag 


1129 


1148SEQ 


in NO 


3991 ctggttactgagctgagag 


1169 


1188 


1 


6 


SEQ ID NO 


: 2656 


afgaagcagtcacatctct 


1197 


1216SEQ 


ID NO 


3992agagc1gocagtccttcat 


10024 


10043 


1 


6 


SEQ ID NO 


2657 


ttgccacagctgattgagg 


1217 


1236SEQ 


ID NO 


3993cc1cctacagtggtggcaa 


4230 


4249 


1 


6 


SEQ ID NO 


: 2658 


agctgattgaggtgtccag 


1224 


1243SEQ 


ID NO 


39Q4ctggat1ocacatgcQgct 


11856 


11874 


1 


6 


SEQ ID NO 


2650 


tgctccactcacatcctcc 


1286 


1305 SEQ 


ID NO 


3995ggaggctttaagttcagca 


7609 


7628 


1 


6 


SEQ ID NO 


2660 


tgaaacgtgtgcalgccaa 


1311 


1330SEQ 


ID NO 


3996ttgggagagacaagtttca 


6508 


6527 


1 


6 


SEQ ID NO 


: 2661 


gacattgctaattacctga 


1611 


1530SEQ 


ID NO 


3997lGagaagctaagcaatgtc 


7240 


7269 


1 


6 


SEQ ID NO 


: 2662 


ttcttottcagactttcct 


1748 


1765 SEQ 


ID NO 


3998 aggagagtccaaattagaa 


8506 


8525 


1 


6 


SEQ ID NO 


: 2663 


ccaatatottgaactcaga 


1911 


1930 SEQ 


ID NO 


3999tctgaattcattcaattgg 


6493 


6512 


1 


6 


SEQ ID NO 


: 2664 


aaagttagtgaaagaagtt 


1954 


1973SEQ 


ID NO 


4000aact8Ooctcactgccttt 


2140 


2159 


1 


6 


SEQ ID NO 


2665 


aagttagtgaaagaagttc 


1955 


1974SEQ 


ID NO 


: 4001 gaacctctggcatttactt 


5924 


5943 


1 


6 


SEQ ID NO 




aaagaagttctgaaagaat 


1964 


1983SEQ 


ID NO 


4002attctctggtaactacttt 


5490 


5509 


1 


6 
SEQ ID NO:


2667 


tttggctataccaaagatg 


2330 2349SEQ 


ID NO: 


4003 catcttaggcactgacaaa 


5005 


5024 




6 


SEQ ID NO: 


2668 


tgttgagaagctgattaaa 


2389 2408SEQ 


ID NO: 


4004tttagccatcggctcaaca 


5708 


5727 




6 


SEQ ID NO: 


2669 


caggaagggctcaaagaat 


2569 2588SEQ 


ID NO: 


4006attectttaacaattcctg 


9500 


9519 




6 


SEQ ID NO: 


2670 


aggaagggctcaaagaatg 


2570 2589SEQ 


ID NO: 


4006cattcctttaacaaltcct 


9499 


9518 




6 


SEQ ID NO: 


2671 


gaagggctcaaagaatgac 


2572 2591 SEQ 


ID NO: 


4007 gtcagtcttcaggctcttc 


7922 


7941 




6 


SEQ ID NO- 


2672 


caaagaatgacttttttct 


2580 2599SEQ 


ID NO: 


4008agaaggatggcattttttg 


14008 


14027 




6 


SEQ ID NO 


2673 


catggagaatgcctttgaa 


2611 2630SEQ 


ID NO: 


4009ttcdgagocaaagtccatg 


7127 


7146 




6 


SEQ ID NO. 


2674 


ggagccaaggctggagtaa 


2687 2706SEQ 


ID NO: 


4010ttactccaacgocagctcc 


3058 


3077 




6 


SEQ ID NO: 


. 2675 


tcattccttocccaaagag 


2892 2911 SEQ 


ID NO: 


4011 ctctctggggcatctatga


5147 


5166 




6 


SEQ ID NO 


2676 


acctatgagctccagagag 


3173 3192SEQ 


ID NO: 


4012 ctcfcaagaocacagaggt


12984 


13003 




6 


SEQ ID NO 


2677 


gggcaaaacgtcttacaga 


3373 3392 SEQ 


ID NO: 


4013 tdgaaagacaacgtgccc


12325 


12344 




8 


SEQ ID NO, 


: 2678 


accctggacattcagaaca 


3395 3414SEQ 


ID NO: 


4014 tgttgctaaggttcagggt


5683 


5702 




6 


SEQ ID NO 


: 2679 


atgggcgacctaagttgtg 


3437 3456 SEQ 


ID NO: 


4015 cacaaattagtttcaccat


8949 


8968 


1 


6 


SEQ ID NO. 


; 2680 


gatgaagagaagattgaat


3626 3645 SEQ 


ID NO: 


4016 attccagcttccccacatc


8338 


8357 


1 


6 


SEQ ID NO: 


; 2681 


caatgtagataccaaaaaa 


3664 3683 SEQ 


ID NO: 


4017 ttttttggaaatgccattg


8651 


8670 


1 


6 


SEQ ID NO 


2682 


gtagataccaaaaaaatga


3668 3687SEQ 


ID NO 


4018 tcatgtgatgggtctctac


4379 


4398 


1 


6 


SEQ ID NO 


: 2683 


gcttcagttcatttggact 


4517 4536SEQ 


ID NO 


4019 agtcaagaaggacttaagc


5312 


5331 


1 


6 


SEQ ID NO 


: 2684 


tttgtttgtcaaagaagtc 


4552 4571 SEQ 


ID NO 


4020 gacttcagagaaatacaaa


11408 


11427 


1 


6 


SEQ ID NO 


2685 


ttgtttgtcaaagaagtca 


4553 4572 SEQ 


ID NO 


4021 tgacttcagagaaatacaa 


11407 


11426 




6 


SEQ ID NO 


. 2686 


tggcaatgggaaactcgct 


5854 5873 SEQ 


ID NO 


4022agcgagaatcaccctgcca 


8227 


8246 


1 


6 


SEQ ID NO 


; 2687 


aacctctggcatttacttt 


5925 5944 SEQ 


ID NO 


4023aaaggagatgtcaagggtt 


10607 


10626 


1 


6 


SEQ ID NO 


2688 


catttactttctctcatga


5934 5953 SEQ 


ID NO 


4024tcatttgaaagaataaatg 


7034 


7053 




6 


SEQ ID NO 


. 2689 


aaagtcagtgccctgctta 


6017 6036SEQ 


ID NO 


4025taagaaccttactgacttt 


7792 


7811 




6 


SEQ ID NO 


2690 


tcccattttttgagacctt 


6330 6349 SEQ 


ID NO 


4026aaggacttcaggaatggga 


12012 


12031 




6 


SEQ ID NO 


2691 


catcaatattgatcaattt 


6421 6440SEQ 


ID NO 


4027 aaattaaaaagtcttgatg 


6740 


6759 




6 


SEQ ID NO 


; 2692 


taaagatagttatgattta 


6673 6692 SEQ


ID NO 


4028 taaaccaaaacttggttta 


9027 


9046 




6 


SEQ ID NO 


: 2693 


tattgatgaaatcattgaa 


6721 6740 SEQ 


ID NO. 


4029ttcaaagacttaaaaaata 


8015 


8034 




6 


SEQ ID NO 


: 2694 


atgatctacatttgtttat 


6798 6817SEQ 


ID NO 


4030ataaQgaaattaaagtcat 


7388 


7407 




6 


SEQ ID NO 


2695 


agagacacatacagaatat 


6927 6946SEQ 


ID NO 


4031 atatattgtcagtgcctct 


13390 


13409 




6 


SEQ ID NO 


2696


gacacatacagaatataga 


6930 6949 SEQ 


ID NO' 


4032tctaaattcagttcttgtc 


11335 


11354 




6 


SEQ ID NO 


: 2697 


agcatgtcaaacactttgt 


7062 7081 SEQ 


ID NO 


4033 acaaagtcagtgccctgct 


6015 


6034 




6 


SEQ ID NO 


: 2698 


tttttagaggaaaccaagg 


7523 7542SEQ 


ID NO 


4034cctttgtgtacaccaaaaa 


11238 


11257 




6 


SEQ ID NO 


: 2699 


ttttagaggaaaccaaggc 


7524 7543 SEQ


ID NO 


4035 gcctttgtgtacaccaaaa 


11237 


11256 




6 


SEQ ID NO 


: 2700 


ggaagatagacttcctgaa


9315 9334SEQ 


ID NO 


4036ttcagaaatBCtgttttcc 


12832 


12851 




6 


SEQ ID NO 


: 2701 


cactgtttctgagtcccag 


9342 9361 SEQ 


ID NO 


4037ctgggacctaccaagagtg 


12531 


12550 




6 


SEQ ID NO 


2702 


cacaaatcctttggctgtg 


9676 9695 SEQ 


ID NO 


4038cacatlteaaggaattgtg 


10071 


10090 


1 


6 


SEQ ID NO 


: 2703 


ttcctggatacactgttcc 


9861 9880 SEQ 


ID NO 


4039ggaac1gttgactcaggaa 


12577 


12596 


1 


6 


SEQ ID NO 


: 2704 


gaaatctcaagctttctct 


10050 10069 SEQ 


ID NO 


4040agagccaggtcgagctttc 


11052 


11071 


1 


6 


SEQ ID NO 


: 2705


tttcttcatcttcatctgt 


10218 10237 SEQ 


ID NO 


4041 acagctgaaagagatgaaa 


13063 


13082 


1 


8 


SEQ ID NO 


: 2706 


tctaccgctaaaggagcag


10529 10548 SEQ 


ID NO 


4042ctgcacgctttgaggtaga 


11769 


11788 




6 


SEQ ID NO 


2707 


ctaccgctaaaggagcagt


10530 10549 SEQ 


ID NO 


4043 actgcacgctttgaggtag


11768 


11787 




6 


SEQ ID NO 


: 2708


agggcctctttttcaccaa


10839 10858 SEQ 


ID NO 


4044ttggccaggaagtggccct 


10965 


10984 




6 


SEQ ID NO 


2709 


ttctccatccctgtaaaag 


11273 11292 SEQ


ID NO 


4045ctttttcaccaacggagaa 


10846 


10865 




6 


SEQ ID NO 


: 2710 


gaaaaacaaagcagattat 


11824 11843 SEQ


ID NO 


4046 ataaactgcaagatttttc 


13608 


13627 




6 


SEQ ID NO 


: 2711 


actcactcattgattttct 


12690 12709SEQ


ID NO 


. 4047dgaaaatcaggatctgagt 


14035 


14054 




6 


SEQ ID NO 


; 2712 


taaactaatagatgtaatc 


12898 12917SEQ


ID NO 


4048 gattaccaccagcagttta 


13586 


13605 




6 


SEQ ID NO 


: 2713 


caaaaogagcttcaggaag 


13208 13227SEQ 


ID NO 


4049 cttcgtgaagaatattttg 


13268 


13287 




6 


SEQ ID NO 


2714 


tggaataatgctcagtgtt 


2374 2393SEQ 


ID NO 


4050aacacttacttgaattcca 


10670 


10689 


3 


5 


SEQ ID NO 


2715 


gatttgaaatccaaagaag


2408 2427 SEQ 


ID NO 


4051 cttcagagaaatacaaatc 


11410 


11429


3 


5 


SEQ ID NO 


: 2716 


atttgaaatccaaagaagt 


2409 2428 SEQ 


ID NO 


4052 acttcagagaaatacaaat 


11409 


11428 


3 


5 
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9717 

1 r 


atcaacagccgcttctttg 


998 


1017SEQ 


ID NO


4053 caaagaagtcaagattgat 


4561 


4580


2 


5 


SEQ ID NO:


2718


tgttttgaagactctccag 


1090 


1109SEQ 


ID NO


4054ctggaaagttaaa8caaca 


6963 


6982 


2 


5 


SEQ ID NO:




cccttctgatagatgtggt 


1332 


1351 SEQ 


ID NO


4055 accaaagctggcaccaggg


13969 


13988 


2 


5 


SEQ ID NO:




tgagcaagtgaagaacttt 


1876 


1895SEQ 


ID NO


4056 aaagccattcagtototoa 


12971 


12990 


2 


5 


SEQ ID NO:


2721


atttgaaatccaaagaagt 


2409 


2428SEQ


ID NO


4057 acttttctaaacttgaaat 


9063 


9082 


2 


5 


SEQ ID NO:


2722


atccaaagaagtcccggaa 


2416 2435SEQ 


ID NO


4058ttccflflaflaaaccteflsat 


12729 


12748 


2 


6 


SEQ ID NO:


2723


agagcctacctocgcatct 


2438 


2457SEQ 


ID NO


4059 aaatggtacqttaacctct 


11929 


11948 


2 


5 


SEQ ID NO:


2724


aatgcctttgaactcccca 


2618 2637SEQ


ID NO


4060 ta QQ aactacaatttcatt 


7020 


7039 


2 


5 


SEQ ID NO:


2725


gaagtccaaattccggatt 


3305 


3324SEQ 


ID NO


4061 aatcttcaatttattcttc 


13823 


13842 


2 


5 


SEQ ID NO:




tgcaagcagaagocagaag 


3504 


3523SEQ


ID NO


4062 cttcaggttccatcgtgca 


11384 


11403 


2 


5 


SEQ ID NO:


2727 


nasaaaaaaattaaattta 


3629 


3648SEQ 


ID NO 




104R7 




9 


R 


OCU lU IMU. 






4605 


4824SEQ 


ID NO 


^na^ pr*s)ts(f n n a sirttps a n/^nt 






O 


\J 






tcoctcacctccacctcta 


4745 


4764SEQ 


ID NO 




R090 


AQ3fl 


o 


6 

V/ 


ccr^ in Kir^> 




atftananntntnanaaort 


5435 


5454SEQ 


ID NO 


• ^nRflnpttttntaaar^lriaaat 
tvuvauuiiuiaticsiuiiyocicil 


QOR*^ 


0089 


4. 


R 
9 


OCU IU IMUI 


2731 




5602 


5621 SEQ 


ID NO 


^ n ^%7 n a t n ttn 9 a a f^nffv*4 
• '*\J\jf allcliguSclaavagiCCI 


1 lOvO 


1 1 Af\7 
1 109/ 


o 


e 
w 


otU IU IMU. 


27o2 


aaaactfl aaacacatcaat 


6409 


6428 SEQ 


ID NO 


* T vu o fill ii^|wiwa iwlvUtii 


10909 


10991 




e 

w 


obU ID iNU: 


2733 


cfactaoaaacaacoanaa 


9426 


9445SEQ 


ID NO 


*r\A90 llbiyaUaubaCCa^lCeig 


1 090^ 


lOOU 1 


A 


c 
O 




2734 


ttaaaaaaaftfrflnanafl 


9590 


eeoosEQ 


ID NO 


«tu r V iiiinancayaDEiivllviaci 




IOOi3& 




R 


CCA in KlOi* 
ocU lU INU. 


27oo 


□aaataaaaoaaaattttci 


10751 10770SEQ 


ID NO 




104ft7 


104ftA 


O 
e. 


o 


ceo in MO* 


z/oo 


taaaaaanataacaaattl 


11992 12011SEQ 


ID NO 


4079 gaatntrmrif^fntinif fa 
\ *twr ^aaaiymeiyuii/uyimci 




10Q91 


d. 




ccn tn Kin* 




aaaatctaaattattttac 


14043 14062SEQ 


ID NO 






10047 




c 
w 


OCU IU IML/. 




atacccttcteaottacta 


26 


45SEQ 


ID NO 


4074 Rannf^ftn nnstnanmf* 


R74A 


^7R7 


1 


IS 

9 


CCA in MA* 

ocu IU nxJ, 


^738 




154 


173SEQ 


ID NO 


4ft7fi ftflcinlfv?apangpffnfif*f* 


?070 


0U09 


1 

t 


e 


OCVJ IU INU. 




ctocactactactactact 


1^ 


181 SEQ 


ID NO 


407fi a acafir^aatitn rn a a n can 


3232 


39<i1 






CCA in Ma- 
ocU lU JNU. 






178 


197SEQ 


ID NO 


4077orinnattff!Qi?fltfif*anf* 


1 lost 


11873 
1 i o r o 


•1 
1 


e 
o 


CPA in MA- 




aaaaaaaaatoctaaaaaa 


201 


220SEQ 


ID NO 


4078 tttttcttcactacatcf t 




9fl1 1 

£,\J 1 1 


1 


«; 


OCA in MA. 




cto Q aa a atotcaaocta Q 


212 


231 SEQ 


lU NU 


407fl ccaoacHccacatcccaa 


3023 


3942 


1 
1 


g 


CCA in MA* 
OCU ttJ WH\J. 




iQaaatccctadoactQct 


304 


323SEQ 


IU NU 


4080 aocatacctaatttdiica 


9953 


9072 


1 
1 




ocu IU i\Ui 




aaaQtccctaaaactacta 


305 


324SEQ 


lU NU. 


408 1 caaeatQcctaatttelcQ 


9952 


0071 


A 
1 


g 


CCA in UA* 


27Zft 


taaaactactaattcaaaa 


313 


332SEQ 


IU NU 


4082 tcttcxatcBcttaaccca 


2050 


2080 


1 

1 


5 


ccn in KiA* 

ocu IL/ IMU. 




ctactaattcaaaaaatac 


318 


337sEQ 


IU NU. 


4083ac!:acaccttnar!attciRan 


1 1087 


11106 
III 


1 
1 




QCA in MA- 
OCU IU IMU. 




to ccaccaaaatcaactac 


334 


353SEQ 


ir\ Ki^. 

ID NO: 


4084 n r!a nn rttnn a ntnnln ncn 


979*; 


77 AA 

^ f *t*T 


1 
1 


(; 


CCA in MA' 
OCU IU INU. 




□ccacca □Qatcaactoca 

y y ywnv***i viy^rfd 


335 


354SEQ 


ID NO; 




9794 


9743 


1 
1 


C 
D 


CCA m KIA« 
OCU IU IMU. 


27O0 




350 


389sEQ 


ID NO; 




47#?9 
If (J^ 


4771 
Hi f 1 


J 


B 


CCA in MA* 
ocU IU INU. 




caaaattaaactaaaaott 

*'wwjjyMy»y*»*jjjjMjjyftft 


352 


371 SEQ 


m kf A. 

ID NO: 


40AQ aaftcn^la^af tfiaannHn 
*r vo V a a wwUvlfelwislijjcici Vlijj 


1 wr wv 


43789 


1 


e 
O 


CCA in MA* 

ocu IU ivu. 




cfCxCicaocfTcaTCctci9 

wftwt|jywi|f wkiwaiwiya 


377 


396SEQ 


m km. 

ID NO: 


40Q0 tnann a a t fr*lr*Bsin an 




lO&OO 


1 


c 
9 


CCA in KIA' 

ocU IU MU, 


^7oo 


caacttcatcctnasascc 


382 


401 SEQ 


ID NO: 


4001 nn^f*^nonHxkootnf*tn 

fvo I ggicugciguBaatgcig 


WOO 


OUlrt 


1 


e 
0 


CCA in MA« 
OCU IU i\U. 




□cttcatcctaaaaaccaa 


384 


403SEQ 


ID NO: 






OO^t 


i 

1 


c 


SEQ ID NO: 


2755 


tcatcctgaagaccagcca 


387 


406SEQ


ID NO: 


4093tggcatggcattatgatga 


3612 


3631 


1 


5 


SEQ ID NO: 


2756 


gaaaaccaagaactctgag


460 


479SEQ 


ID NO: 


4094 ctcaaccttaatgattttc 


8294 


8313 


1 


5 


SEQ ID NO: 


2757


agaactctgaggagtttgc 


468 


487SEQ


ID NO: 


4095 gcaagctatacagtattct 


8385 


8404 


1 


6 


SEQ ID NO: 


2758 


tctgaggagtttgctgcag 


473 


492SEQ 


ID NO: 


4096 ctgcaggggatcccccaga 


2534 


2553 


1 


5 


SEQ ID NO: 


2759 


tttgctgcagccatgtcca 


482 


501 SEQ


ID NO: 


4097tggaagtgtcagtggcaaa 


10380 


10399 


1 


5 


SEQ ID NO: 


2760 


caagaggggcatcatttct 


586 


605SEQ


ID NO:


4098 agaataaatgacgttcttg


7043 


7062 


1 


5 


SEQ ID NO: 


2761 


tcactttaccgtcaagacg 


682 


701 SEQ 


ID NO:


4099cgtctacactatcatgtga 


4368 


4387 


1 


5 


SEQ ID NO: 


2762 


tttaccgtcaagacgagga 


686 


705SEQ


ID NO: 


4100 tccttgacatgttgataaa 


7374 


7393 


1 


5 


SEQ ID NO: 


2763 


cactggacgctaagaggaa 


861 


880SEQ


ID NO: 


4101 ttccagaaagcagccagtg 


12506 


12525 


1 


5 


SEQ ID NO: 


2764 


aggaagcatgtggcagaag 


875 


894SEQ


ID NO: 


4102 cttcatacacattaatcct


9996 


10015 


1 


5 


SEQ ID NO: 


2765 


caaggagcaacacctcttc 


901 


920SEQ 


ID NO:


4103 gaagtagtactgcatcttg


6843 


6862 


1 


5 
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SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



2766 acagactttgaaacttgaa

2767 tgatgaagcagtcacatct

2768 agcagtcacatctctcttg

2769 ccagccccatcactttaca 

2770 ctccactcacatcctccag 

2771 catgccaacccccttctga 

2772 gagagatcttcaacatggc 

2773 tcaacatggogagggatca 

2774 ccaccttgtatgcgctgag 

2775 gtcaacaactatcataaga 

2776 tggacattgctaattacct

2777 ggacattgctaattacctg 

2778 ttctgcgggtcattggaaa 

2779 ccagaactcaagtcttcaa 

2780 agtcttcaatoctgaaatg 

2781 tgagcaagtgaagaacttt 

2782 agcaagtgaagaactttgl 

2783 tctgaaagaatctcaactt 

2784 actgtcatggacttcagaa 

2785 acttgaoccagoctcagcc 

2786 tccaaataactaccttcct 

2787 actaocctcactgcctttg 

2788 Itggatttgcttcagctga 

2789 ttggaagdclttttggga 

2790 ggaagctctttttgggaag 

2791 tttttcccagacagtgtca 

2792 agacagtgtcaacaaagct 

2793 ctttggctataccaaagat 
2794 caaagatgataaacatgag

2795 gatatggtaaatggaataa 

2796 ggaataatgctcagtgttg 

2797 tttgaaatccaaagaagtc

2798 gatcocccagatgattgga 

2799 cagatgattggagaggtca 

2800 agaatgacttttttcttca 

2801 gaactccccactggagctg 

2802 atatcttcatctggagtca 

2803 gtcattgdcccggagcca 

2804 gctgaagtttatcattoct 

2805 attocttccccaaagagac 

2806 ctcattgagaacaggcagt 

2807 ttgagcagtattctgtcag 
2808 accttgtccagtgaagtcc
2809 ccagtgaagtccaaattcc
2810 acattcagaacaagaaaat

2811 gaaaaatcaagggtgttat 

2812 aaatcaagggtgttatttc 

2813 tggcattatgatgaagaga 

2814 aagagaagattgaatttga 

2815 aaatgacttccaatttccc 



957 986SEQ ID NO 
1195 1214SEQIDNO 
1201 1220SEQIDNO 
1239 1258SEQIDNO 
1288 1307SEQIDNO 
1322 1341 SEQ ID NO 
1398 1417SEQIDNO 
1407 1428SEQIDNO 
1437 1456SEQIDNO 
1463 1482SEQIDNO 

1509 1528SEQIDNO:

1510 1529SEQIDNO 
1581 1600SEQIDNO; 
1628 1647SEQIDNO 
1638 1657SEQIDNO; 
1876 1895SEQIDNO: 
1878 1897SEQIDNO: 
1972 1991sEQIDNa 
1994 2013SEQIDNO 
2059 2078SEQIDNO; 
2104 2123SEQIDNO; 
2141 2160sEQlDNa 
2157 2176sEQIDNa 
2219 2238sEQIDNa 
2221 2240SEQIDNO: 
2246 2265SEQIDNO: 
2254 2273SEQIDNO: 
2329 2348SEQIDNO: 
2341 2360SEQIDNO:
2363 2382SEQIDNO: 
2375 2394SEQIDNO: 
2410 2429SEQIDNO: 
2542 2561 SEQ ID NO: 
2549 2568SEQIDNO:
2583 2602SEQIDNO:
2627 2646SEQIDNO: 
2650 2679SEQIDNO: 
2675 2894SEQIDNO: 
2881 2900SEQIDNO: 
2894 2913SEQIDN0: 
2984 3003SEQIDNO: 
3160 3169SEQIDNO: 
3293 3312SEQIDN0: 
3300 3319SEQIDNO: 
3402 3421 SEQ ID NO: 
3471 3490SEQIDNO: 
3474 3493SEQIDNO: 
3617 3636SEQIDNO: 
3630 3649SEQIDNO:
3681 3700SEQIDNO: 



4104 ttcaattcttcaatgctgt
4105 agatttgaggattccatca
4106 caaggagaaactgactgct
4107 tgtagtctcctggtgctgg
4108 ctggagcttagtaatggag
4109 tcagatgagggaacacatg
4110 gccaccctggaactctctc
4111 tgatcccacctctcattga
4112 ctcegggatctgaaggtgg
4113 tcttgagttaaatgctgac
4114 aggtatattcgaaagtcca
4115 caggtatattcgaaagtcc
4116 tttcacatgccaaggagaa
4117 ttgaagtgtagtctcctgg
4118 catttctgattggtggact
4119 aaagtgccacttttactca
4120 acaaagtcagtgccctgct
4121 aagtccataatggttcaga
4122 ttctgaatatattgtcagt
4123 ggctcacoctgagagaagt
4124 aggaagatatgaagatgga
4125 caaatttgtggagggtagt
4126 tcagtataagtacaaccaa
4127 tcccgattcacgcttccaa
4128 cttcagaaagctaccttcc
4129 tgaccttctctaagcaaaa
4130 agcttgottttgocagtcl
4131 atctegtgtcteggaaaag
4132 ctcaaggataacgtgtttg
4133 ttatcttattaattatatc
4134 caacacttacttgaattcc
4135 gacttcagagaaatacaaa
4136 tccaatttccctgt9gato
4137 tgaccacacaaacagtctg
4138 tgaagtccggattcattct
4139 cagctcaaocgtacagttc
4140 tgacttcagtgcagaatat
4141 tggcccGgtttaccatgac
4142 aggaggctttaagttcagc
4143 gtctcttcctccatggaat
4144 actgactgcacgctttgag
4145 Gtgagagaagtgtcttcaa
4146 ggacggtactgtcccaggt
4147 ggaaggcagagtttactgg
4148 atttcctaaagctggatgt
4149 ataaactgcaagatttttc
4150 gaaacaatgcattagattt
4151 tctcccgtgtataatgcca
4152 tcaaaacctactgtctc(t
4153 gggaactacaatttcattt



10508 


10527 


1 5 


7984 


8003 ' 


1 5 


6532 


6551 


1 5 


5102 


6121 


1 5 


8717 


8736 


1 5 


8927 


6946 


1 5 


10877 


10896 


1 5 


2973 


2992 


1 5 


8195 


8214 


1 5 


4987 


5006 ' 


1 5 


12807 


12826 


1 5 


12806 


12825 • 


1 5 


6522 


6541 


1 5 


5096 


5115 ' 


1 5 


7765 


7784 ' 


1 5 


6191 


6210 • 


1 5 


6015 


6034 ' 


1 5 


12819 


12838 ' 


1 5 


13384 


13403 


1 5 


12399 


12418 ' 


1 5


4720 


4739 • 


1 5 


10327 


10346 ' 


1 5 


9400 


9419 ' 


1 5 


11585 


11604 


1 5 


7937 


7956 ' 


1 5 


4884 


4903 ' 


1 5 


2466 


2485 ' 


1 5 


5976 


5995 ' 


1 5 


12617 


12636 ' 


1 5 


13087 


13106 ' 


1 5 


10669 


10688 ' 


1 5 


11408 


11427 ' 


1 5 


3689 


3708 ' 


1 5 


5371 


5390 1 


1 5 


11023 


11042 1 


1 5 


11869 


11888 1 


I 5 


11974 


11993 1 


1 5 


5817 


5636 1 


I 5 


7608 


7627 1 


1 5 


10478 


10497 1 


1 6 


11764 


11783 1 


5 


12407 


12426 1 


5 


12792 


12811 1 


5 


9156 


9175 1 


5 


11175 


11194 1 


5 


13608 


13627 1 


5 


9753 


9772 1 


5 


11789 


11808 1 


5 


10465 


10485 1 


5 


7021 


7040 1 


5 
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SEQ ID NO: 


2816


atgacttccaatttccctg 


3683 3702SEQIDNO 


41 54 caggctgattacgagtcat 


4925 


4944 


1 5 


SEQ ID NO: 


2817 


acttccaatttccctgtgg 


3686 


3705SEQIDNO 


41 55 ccacgaaaaatatggaagt 


10368 


10387 


1 5 


SEQ ID NO: 


2818 


agttgcaatgagdcatgg 


3811 


3830SEQIDNO 


: 4156ccatcagttcagataaact 


7997 


8016 


1 5 


SEQ ID NO: 


2819


tttgcaagaccacctcaat


3868 


3887 SEQ ID NO 


4157 attgacctgtccattcaaa


13679 


13698


1 5 


SEQ ID NO: 


2820 


gaaggagitcaacctccag 


3892 


391 1 SEQ ID NO 


41 58 ctggaattgtcattccttc 


11736 


11755 


1 5 


SEQ ID NO: 


2821 


acttccacatcccagaaaa 


3927 3946SEQIDNO 


4159 ttttaacaaaagtggaagt


6829 


6848


1 5 


SEQ ID NO: 


2822 


ctcttcttaaaaagcgatg 


3947 3966SEQIDNO 


41 60catcactgccaaaggagag 


8494 


8513 


1 5 


SEQ ID NO: 


2823


aaaagcgatggccgggtca 


3956 


3975SEQ ID NO


41 61 tgactcactcattgatttt 


12688


12707 


1 5 


SEQ ID NO: 


2824 


ttcctttgccttttggtgg 


4011 


4030 SEQ ID NO 


41 62ccacaaacaatgaagggaa 



9264 


9283 


1 5 


SEQ ID NO: 


2825


caagtctgtgggattccat 


4087 


4106SEQIDNO 


41 63 atgggaaaaaacaggcttg 


9574 


9593 


1 5 


SEQ ID NO: 


2826 


aagtccctactlttaccat 


4125 4144SEQIDNO 


4 1 64 atgggaagtataagaactt 


4842 


4861 


1 5 


SEQ ID NO* 


2827 


tgcctctoctgggtgitct 


4167 


4186SEQ ID NO


4165 agaaaaacaaacacaggca


9651 


9670 


1 5 


SEQ ID NO* 


2828 


accagcacagaocatttca 


4250 


4269 SEQ ID NO


41 66 tqaaQtgtaqtctcctagt 


5097


5116


1 6 


SEQ ID NO' 


2829 


ccagcacagaccatttcag 


4251 


4270 SEQ ID NO 


41 67 ctaaaataca atp ctctaa 


5519 


5538 



1 5 



SEQ ID NO- 


2830 


actatcatgtgatgggtct 


4375 


4394SEQ ID NO


41 68 aaacacctaattttataat 


7956 


7975 



1 5 



SEQ ID NO- 


2831 


accacagatgtctgcttca 


4504 


4523SEQ ID NO 


41 69 taaaQQctaactctotaat 


4290 


4309 


1 5 



SEQ ID NO* 


2832


ccacagatgtctgcttcag 


4505 


4524SEQ ID NO 


41 70 ctaaflcaacaaatttotaa 


10319 



10338 



t 5 



SEQ ID NO' 


2833 


tttggactocaaaaagaaa 


4526 


4647SEQ ID NO 


41 71 tttetctcataattacaaa 


5941 


5960 


1 5 



SEQ ID NO- 


2834 


teaaagaagtcaagattga 


4560 


4579SEQIDNO 


41 72tcaaaQataacatatttaa 


12618 



12637 


1 6 



SEQ ID NO- 


2835 


atgagaactacgagctgac 


4806 


4825SEQ ID NO 


41 73 atcaoatattattactcat 


10195 



10214 



1 5 


SEQ ID NO- 


2836 


ttaaaatctgacaocaatg 


4826 


4845SEQIDNO 


41 74 cattcattgaagatgttaa 


7360 


7369 


1 5 



SEQ ID NO* 


2837 


gaagtataagaactttgcc 


4846


4865SEQIDNO 


41 75ggcaaatttgaagciacttc 


12002 


12021 


1 5 


SEQ ID NO' 


2838 


aagtataagaactttgcca 


4847 


4866SEQ ID NO 


41 76taaGaaatttaaaaaactt 


12001 


12020 


t 5 



SEQ ID NO* 


2839 


ttcttcagcctgctttctg 


4949 


4968SEQ ID NO 


41 77 cagaatocaaatecaaQaa 


6892



6911 



1 5 


SEQ ID NO* 


2840 


ctggatcactaaattccca 


4965 


4984SEQ ID NO 


41 78tgaqtctttocaaagccaa 


11041 


11060 ' 


! 6 


SEQ ID NO' 


2841 


aaattaatagtggtgctca 


5022 


5041 SEQ ID NO 


41 79 taaaaaQocccaaaaattt 


6256 


6275 



1 5 



SEQ ID NO- 


2842 


agtgcaacgaccaacttga 


5081 


5100SEQIDNO. 


41 80 tcaaattcctaaatacact 


9866 


0875 



1 6 



SEQ ID NO* 


2843 


ctgggaagtgcttatcagg 


5246 


5265 SEQ ID NO:


4181 cctaaccttcacataccaa 


8318 



8337 



1 5 



SEQ ID NO' 


2844 


gcaaaaacattttcaactt 


5286 


5305SEQ ID NO:


41 82 aagtaaaagaaaattttgc 


10752 


10771 ' 



1 5 


SEQ ID NO: 


2845 


aaaaacattttcaacttca 


5288


5307SEQ ID NO: 


4183 tgaagtaaaagaaaatttt


10750


10769 1 


1 5 


SEQ ID NO' 


2846


tcagtcaagaaggacttaa 


5310 


5329SEQ ID NO:


41 84 ttaaggacttccattctga 


13371 


13390 1 


1 5 


SEQ ID NO' 


2847 


tcaaatgacatgatgggct 



5333 


5352 SEQ ID NO: 


41 85 agcccatcaatatcattaa 


6213 



6232 1 


1 5 


SEQ ID NO' 


2848


cacacaaacagtctgaaca 


5375


5394SEQIDNO: 


4186 tgtttcaactqcctttatq 


11227 


11246 1 



1 5 


SEQ ID NO' 


2849 


tcttcaaaacttgacaaca 


5417 


5436 SEQ ID NO: 


4187 tattttcclatttccaaaa


12843 


12862 1 


I 5 


SEQ ID NO' 


2850 


caagttttataagcaaact 


5449 


5468SEQ ID NO: 


41 88 agttattttaotaaactta 


14051 


14070 1 



1 5 



SEQ ID NO' 


2851 


tggtaactactttaaacag 


5496 5515SEQIDNO:


41 89ctatttttaaaaaaaacca 



7520 


7539 1 



5 



SEQ ID NO- 


2852 


aacagtgacctgaaataca 


5510 5529SEQIDNO:


4190 totataacaaattcctatt


6898 


6917 1


5 


SEQ ID NO' 


2853 


gggaaactacggctagaac 


5552 


5571 SEQ ID NO: 


4191 qttccttccatoatttccc 


10941 



10960 1 



5 



SEQ ID NO: 


2854 


aacacatctatgccatctc 


5628 


5647SEQ ID NO:


41 92 gagacagcatcttcgtgtt 


11212 


11231 1 


5 


SEQ ID NO: 


2855 


tcagcaagctataaagcag 


5660 


5679SEQ ID NO: 


41 93ctgctaagaaccttactga 


7788 


7807 1 


5 


SEQ ID NO: 


2856 


gcagacactgttgctaagg 


5675


5694SEQ ID NO: 


41 94cctttcaagcactgactgc 


11754


11773 1 


5 


SEQ ID NO: 


2857 


tctggggagaacatactgg 


5874 5893SEQIDNO:


4195ccaggttttccacaccaga 


8046 


8065 1 


5 


SEQ ID NO: 


2858 


ttctctcatgattacaaag 


5942 


5961 SEQ ID NO:


41 96 cttttteaccaacggagaa 


10846 


10865 1 


5 


SEQ ID NO: 


2859 


ctgagcagacaggcacctg 


6042 


6061SEQIDNO:


4197caggaggctttaagttcag 


7607 


7626 1 


5 


SEQ ID NO: 


2860 


caatttaacaacaatgaat 


6074 


6093SEQ ID NO: 


41 98atlccttcctttacaattg 


8090 


8109 1 


5 


SEQ ID NO: 


2861 


tggacgaactctggcfgac 


6148 


6167SEQ ID NO:


41 99 gtcagcccagttccttcca 


10932 


10951 1 


5 


SEQ ID NO: 


2862 


cttttactcagtgagccca 


6200 


6219SEQIDNO: 


4200tgggctaaacgtatgaaag 


7835 


7854 1 


5 


SEQ ID NO: 


2863


tcattgatgctttagagai 


6225 6244SEQIDNO: 


4201 atcttcataagttcaatga 


13182 


13201 1 


5 


SEQ ID NO: 


2864 


aaaaccaagatgttcactc 


6303 


6322SEQ ID NO: 


4202gagtgaaatgctgtttttt 


8638 


8657 1 


5 
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SEQ ID NO 


2865 


a9933tC9acaaaccatta 


OODO 


6384 SEQ 


ID NO 



4203 taatgattttcaagucct 


8302 



8321 


1 t7 


SEQ ID NO 


; 2866 


tagttgtactggaaaacgt 


ooo4 


6403 SEQ 


ID NO 


4204 acgttagcctctaagacta 


11936 


11955 


1 0 


SEQ ID NO 


: 2867 


ggaaaac9iaca9^^^39 


OoS4 


6413SEQ 


ID NO 



4205 ctuiacaancattucc 


13022 


13041 


4 IS 


SEQ ID NO 


: 2868 


gaaaacgtacagagaaagc 




6414SEQ


ID NO 


4Z0D gctRctcnccacatttc 


10060 


10079 


1 0 


SEQ ID NO 


2869 


aaagctgaagcacatcaat 


6409 


6428SEQ 


ID NO 


4207 attgatgttagagtgcttt 


6992 


7011 


1 5 


SEQ ID NO 


: 2870 


aagctgaagcacatcaata 


6410 


6429 SEQ 


ID NO 



4208 tattgatgttagagtgctt 


6991 



7010 



1 5


SEQ ID NO 


: 2871 


tgaagcacatcaatattga 


6414


6433SEQ 


ID NO 


4209 tcaaccttaatgattttca


8295 


8314 


1 5 


SEQ ID NO 


; 2872 


atcaatattgatcaatttg 


6422


6441 SEQ 


ID NO 


42 1 0 caaagccatcactgatgat 


1668 


1687 


1 5 


SEQ ID NO 


: 2873 



taatgattatctgaattca 



6484 


6503SEQ 


ID NO 


421 1 tgaaatcattgaaaaatta 


6727 


6746 


t 5 


SEQ ID NO 


: 2874 


gattatctgaattcattca 


6488 


6507SEQ


ID NO 


421 2tgaagtagctgagaaaatc 


7102 


7121 


1 5 


SEQ ID NO 


2875 


aattgggagagacaagttt 


6506 


6525SEQ 


ID NO 


421 3aaacattcctttaacaatt 


9496 


9515 


1 5 


SEQ ID NO 


: 2876 


aaaatagctattgctaata 


6701 


6720SEQ 


ID NO 


4214 tattgaaaatattgatttt


6814 


6833 


1 5 


SEQ ID NO 


2877 



aaaattaaaaagtcttgat 


6739 


6758 SEQ


ID NO 


42 1 5 atcatatccgtgtaatttt 


6765 


6784 


1 5 


SEQ ID NO 


: 2878 


ttgaaaatattgattttaa 


6816 


6835 SEQ 


ID NO 


42 1 6 ttaatcttcataagttcaa 


13179 


13198 


1 5 


SEQ ID NO 


: 2879



agacatccagcacctagct 


6946 


6965SEQ 


ID NO 


42 1 7agGttggttttgccagtct 


2466 


2485 


1 5 


SEQ ID NO 


: 2880 



caatttcatttgaaagaat 


7029 


7048SEQ


ID NO 


421 8attcdtcctttaGaattg 


8090 


8109 


1 5 


SEQ ID NO 


: 2881 


aggttttaatggataaatt 



7182 


7201 SEQ 


ID NO 


42 1 9aattgttgaaagaaaacct 


13155 


13174 


1 5 


SEQ ID NO 


: 2882 


cagaagctaagcaatgloc 


7241 


7260SEQ 


ID NO 


4220 ggacaaggcccagaatctg 


12553 


12572 


1 5 


SEQ ID NO 


: 2883 


taagataaaagattacttt 


7270 


7289SEQ


ID NO 


4221 aaagaaaacctatgcctta 


13163 


13182 


1 5 


SEQ ID NO 


: 2884 


aaagattactttgagaaat 


7277 


7296SEQ 


ID NO 


4222atttcttaaacattccttt 


9489 


9508 


1 5 


SEQ ID NO 


: 2885 



gagaaattagttggattta 


7289 


7308 SEQ 


ID NO 


4223taaagccattcagtctctc 


12970 


12989 


1 5 


SEQ ID NO 


: 2886 


atttattgatgatgctgtc 


7303 


7322SEQ 


ID NO 


4224gacatgttgataaagaaat 


7379 


7398 


t 5 


SEQ ID NO 


: 2887 



gaattatcttttaaaacat 


7334 


7353SEQ 


ID NO 


4225atgtatcaaatggacattc 


7685 


7704 


1 5 


SEQ ID NO 


2888 



ttaocaccagtttgtagal 


7411 


7430SEQ 


ID NO. 



4226atctggaaccttgaagtaa 


10739 


10758 


1 5 


SEQ ID NO 


; 2889 


ttgcagtgtatctggaaag 


7548 


7567SEQ 


ID NO. 



4227 cttttcacattagatgcaa 


8420 


8439 


1 5 


SEQ ID NO 


: 2890


cattcagcaggaacttcaa 


7699 


7718SEQ 


ID NO: 



4228ttgaaggacttcaggaatg 


12009 


12028 


1 5 


SEQ ID NO 


: 2891 



acacctgattttatagtcc 


7958


7977SEQ 


ID NO: 


4229ggactcaaggataacgtgt 


12614 


12633 


1 5 


SEQ ID NO 


: 2892 


ggattccatcagttcagat 


7992 


8011 SEQ 


ID NO: 



4230atcttc&atgattatatcc 


13124 


13143 


1 5 


SEQ ID NO 


2893 


ttgtagaaatgaaagtaaa 


8112


8131 SEQ


ID NO' 



4231 tttatgattatgtcaacaa 


12360 


12379 


1 5 


SEQ ID NO 


: 2894 


ctgaacagtgagctgcagt 


8156 


8175SEQ 


ID NO 



4232actggacttctctagtcag 


8809 


8828 


1 5 


SEQ ID NO 


: 2895


aatccaatctoctcttttc 


8407 


8426SEQ 


ID NO. 



4233gaaaaatgaagtccggatt 


11017 


11036


1 5 


SEQ ID NO 


: 2896 


attttgattttcaagcaaa 


8532 


8551 SEQ


ID NO. 



4234tttgcaagttaaagaaaat 


14023 


14042 


1 5 


SEQ ID NO 


2897 



ttttgattttcaagcaaat 


8533 


8552SEQ 


ID NO. 



4235atttgatttaagtgtaaaa 


9622 


9641


1 5 


SEQ ID NO 


2898 


tgattttcaagcaaatgca 


8536 


8555SEQ 


ID NO: 


4236tgcaagttaaagaaaatca 


14025 


14044 


1 5 


SEQ ID NO 


2899 


atgctgttttttggaaatg 


8645 


8664SEQ 


ID NO: 


4237cattggtag9agacagcat 


11203 


11222 ' 


1 5 


SEQ ID NO 


2900 


tgctgttttttggaaatgc 


8646 


8665SEQ 


ID NO: 


4238gcattggtaggagacagca 


11202 


11221 ' 


1 5 


SEQ ID NO. 


2901 


aaaaaaatacactggagct 


8706 


8725SEQ 


ID NO: 


4239 agctagagggcctcttttt 


10833 


10852 


1 5 


SEQ ID NO: 


2902 


actggagcttagtaatgga 


8716 


8735SEQ 


ID NO: 


4240 tccactcacatcctccagt 


1289 


1308 ' 


1 5 


SEQ ID NO: 


2903 


cttctggaaaagggtcatg 


8886 


8905SEQ 


ID NO: 


4241 catgaacccctacatgaag 


13759 


13778 ' 


1 5 


SEQ ID NO: 


2904 


ggaaaagggtcatggaaat 


8891 


8910SEQ 


ID NO: 


4242aUtgaaagttcgttttcc 


9282 


9301 ' 


1 5 


SEQ ID NO: 


2905 


gggcctgcoccagattctc 


8910 


8929SEQ


ID NO: 


4243gagaacattatggaggccc 


9440 


9459 1 


1 5 


SEQ ID NO: 


2906 


ttctcagatgagggaacac 


8924 


8943SEQ 


ID NO: 


4244 gtgtcttcaaagctgagaa 


12416 


12435 ' 


1 5 


SEQ ID NO: 


2907 


gatgagggaacacatgaal 


8930 


8949SEQ 


ID NO: 


4245 attccagcttccccacatc


8338 


8357 1 


1 S 


SEQ ID NO: 


2908


ctttggactgtccaataag 


8986 


9005SEQ


ID NO: 


4246 cttatgggatttcdaaag 


11167 


11186 1 


1 6 


SEQ ID NO: 


2909 


gcatccacaaacaatgaag 


9260 


9279SEQ 


ID NO: 


4247 cttcatctgtcattgatgc 


10227 


10246


1 5 


SEQ ID NO: 


2910 


cacaaacaatgaagggaat 


9265 


9284SEQ 


ID NO: 


4248 attccctgaagttgatgtg 


11488


11507 1 


1 5 


SEQ ID NO: 


2911 


ccaaaatttctctgctgga 


9415 


9434SEQ 


ID NO: 


4249 tccatcacaaatcctttgg


9671 


9690 1 


1 5 


SEQ ID NO: 


2912 


caaaatttctctgctggaa 


9416 


9435SEQ


ID NO: 


4250 ttccatcacaaatcctttg


9670 


9689 1 


1 6 


SEQ ID NO: 


2913 


tctgctggaaacaacgaga 


9425 


9444SEQ


ID NO: 


4251 tctcaagagttacagcaga 


13229 


13248 1 


1 5 






WO 2004/080406



PCT/US2004/007070



SEQ ID NO: 


2914 


ctgctggaaacaacgagaa


9426 


9445 SEQ 


ID NO 


4252ttctcaagagttacagcag 


13228 


13247 


1 


5 


SEQ ID NO: 


2915 


agaacattatggaggccca


9441 


9460SEQ


ID NO 


; 4253tgggcctgccccagattct 


8909 


8928 


1 


5 


SEQ ID NO* 


2916 


agaagcaaatctggatttc 


9476 


9494SEQ 


ID NO 


4254 gaaatcttcaatttattct 


13821 


13840 


1 


5 


SEQ ID NO' 


2917


tttctctctatgggaaaaa 


9565 


9584SEQ




; 4255tttttgcaagttaaagaaa 


14021 


14040 


1 


5 


SEQ ID NO' 


2918 


tcagagcatcaaatccttt 


9712 


9731 SEQ 


ID NO


4256 aaaaaaaatcaQQatctga 


14033 


14052 


1 


5 


SEQ ID NO' 


2919 


cagaaacaatgcattagat 


9751 


9770SEQ


ID NO


4257atctal90catctcttctg 


5633 



5652


1 


5 


SEQ ID NO* 


2920 


tacacattaatcctgccat 


10001 10020SEQ


ID NO


4258 atgqaatctttattgtota 


14089 


14108 


1 


5 


SEQ ID NO:




agtcagatattgttgctca 


10194 10213SEQ 


ID NO


4259tciaQaactacaaQctaact 


4807 


4826 


1 

1 


5 


SEQ ID NO:




99Q999tao|tcataacagt 


10336 10355 SEQ 


ID NO:


4260 actQOitaacaaaaccctcc 


2734 


2753 


1 

1 


5 


SEQ ID NO:


2923 


caaaagccgaaattccaat 


10404 10423 SEQ 


ID NO:


4261 attaaaatacctactttta 


8366 


8385 


1 


5 


SEQ ID NO:


2924 


aaaagcogaaattccaatt 


10405 10424 SEQ 


ID NO:


4262 aattaaaatacctactttt 


6365 


8364 


1 


5 



SEQ ID NO:


2925


ttcaagcaagaacttaatg 


10436 10455 SEQ 


ID NO:



4263cattatgcK3ccttoataaa 


13258 


13277 


1 


5 



SEQ ID NO:


2926


cctcttacttttccattga 


10578 10597 SEQ 


ID NO:


4264tcaffiiaaaaaoccaaaaa a 


12947 


12966 


1 


5 


SEQ ID NO:


2927 


tgaggocaacacttacttg 


106631 0682 opo 


ID NO:


4265 caacjcatctciattQactca 


12676 


12695 



1 

1 


5 



SEQ ID NO:


2928


cacttacttgaattccaag 


1067210691 SEQ 


ID NO:


4266 cttgaacacaaagtcagtg


6008 


6027 


1 


5 


SEQ ID NO:


2929 


gaagtaaaagaaaattttg 


10751 10770 SEQ 


ID NO:


4267 caaaaacattttcaacttc 


5287 


5306 


1 


5 


SEQ ID NO:


2930 


cctggaactctctacatgg 


1088210901 SEQ 


ID NO:


4268 ccatttacaaatcttcagg 


11372 


11391 


1 


5 


SEQ ID NO:


2931 


agctggatgtaaccaccag 


11184 11203SEQ 


ID NO:


4269 ctggattocacatgcagct 


11855 


11874 


1 


5 


SEQ ID NO: 


2932 


aaaattcoctgaagttgat 


11485 11 504SEQ 


ID NO 


4270atcata1ocgtgtaatttt 


6765 


6784 


1 


5 


SEQ ID NO: 


2933 


cagatggcattgctgcttt 


1161311632SEQ 


ID NO:


4271 aaagclgagaagaaatctg 




12443 


1 

9 


5 


SEQ ID NO: 


2934 


agatggcattgctgctttg 


11614 11633SEQ 


ID NO 


4272caaagctgagaagaaatct 


12423 


12442 


1 


5 


SEQ ID NO: 


2935 


tgttgaaacagtcctggat 


11842 11861 SEQ 


ID NO 


4273 atccaagatgagatcaaca 


13103 


13122 


1 


5 


SEQ ID NO: 


2936 


catattcaaa actg agttg 


12229 12248SEQ 


ID NO 


4274 caactctctgattactatg 


13631 


13650 


1 


5 


SEQ ID NO: 


2937 


aaagatttatcaaaagaag 


12938 12957SEQ 


ID NO 


4275cttcaatttattcttcttt 


13826 


13846 


1 


5 


SEQ ID NO: 


2938


attttccaactaatagaag 


13034 13053SEQ 


ID NO 


4276 cttcaaagadtaaaaaat 


8014 


8033 


1 


5 


SEQ ID NO: 


2939 


aattatatccaagatgaga 


13097 131 16SEQ 


ID NO 


4277tctcttcctccatggaatt 


10470 


10498 


1 


5 


SEQ ID NO:


2940 


Itcaggaagcttctcaaga 


13218 13237SEQ 


ID NO 


4278 tcttcataagltcaatgaa 


13183 


13202 


1 


5 


SEQ ID NO: 


2941 


ttgagcaatttctgcacag 


13437 13456 SEQ 


ID NO 


4279 ctgttgaaagattiatcaa 


12032 


12061 


1 


5 


SEQ ID NO: 


2942 


ctgatatacatcacggagt 


13712 13731 SEQ 


ID NO 


4280 actcaatggtgaaattcag 


7465 


7484 


1 


5 


SEQ ID NO: 


2943 


acatcacggagttactgaa 


13719 13738 SEQ 


ID NO 


4281 ftcagaagctaagcaatgt 


7239 


7258 


1 


5 


SEQ ID NO: 


2944 


actgcctatattgataaaa 


13882 13901 SEQ 


ID NO:


4282ftttg9caagctatacagt 


8380 


8399 


1 


5 


SEQ ID NO: 


2945 


aggatggcalUUtgcaa 


14011 14030SEQ 


ID NO:


4283ttgcaagcaaatctttcd 


3013 


3032 


1 



5 



SEQ ID NO: 


2946


ftttttgcaagttaaagaa 


14020 14039 SEQ 


ID NO:


4284ttctctctatgggaaaaaa 


9566 


9585 


1 


5 


SEQ ID NO: 


2947 


tccagaactcaagtcttca 


1627 


1646 SEQ 


ID NO:


4285tgaaatgctgttttttgga 


8641 


8660 


3 



4 


SEQ ID NO:


2948


agttagtgaaagaagttct 


1956 


1975SEQ 


ID NO:


4286agaatctatac^aQaact 


12364 


12583 


3 


4 


SEQ ID NO:


2949 


atttacagctctgacaagt 


5435 


5464 SEQ 


ID NO:


4287 aciteagagaaatacaaat 


11409 


11428 


3 


4 


SEQ ID NO: 


2950 


gattatctgaattcattca 


6488 


6507 SEQ


ID NO:


4288fgaaaocaatgacaaaatc 


7429 


7448 


3 


4 


SEQ ID NO: 


2951 


gtgcccttctcggttgctg 


26 


45 SEQ 


ID NO:


4289 cagctgagcagacagacac 


6039 


6058 



2 



4 


SEQ ID NO:


2952 


attcaagcacctccggaag 


253 


272 SEQ 


ID NO:


4290 cttcataagttcaatgaat 


13184 


13203 


2 


4 


SEQ ID NO: 


2953 


gactgctgattcaagaagt 


316 


335 SEQ 


ID NO: 


4291 acttcccaactctcaagtc 


13415 


13434 


2 


4 


SEQ ID NO: 


2954 


ttgctgcagccatgtccag 


483 


502 SEQ 


ID NO: 


4292 ctgggcagctgtatagcaa 


5889 


5908 


2 


4 


SEQ ID NO: 


2955


agaaagatgaacctactta


555


574 SEQ 


ID NO:


4293taagtatgatttcaattct 


10498 


10517 


2 


4 


SEQ ID NO: 


2956 


tgaagactotccaggaact 


1095 


1114SEQ 


ID NO: 


4294agttcaatgaatttattca 


13191 


13210 


2 


4 


SEQ ID NO: 


2957 


atctctcttgccacagctg 


1210 


1229SEQ 


ID NO: 



4295 cagcccagocetttgagat 


9237 


9256 


2 


4 


SEQ ID NO: 


2958


tctdcttgccacagctga 


1211 


1230 SEQ 


ID NO: 


4296tcagcccagccatttgaga 


9236 


9255 


2 


4 


SEQ ID NO: 


2959 


tgaggtgtccagccccatc 


1231 


1260SEQ 


ID NO: 


4297 gatgggaaagccgccctca


6216 


5235 


2 


4 


SEQ ID NO: 


2960 


ccagaactcaagtcttcaa 


1628 


1647 SEQ 


ID NO: 


4298ttgaaagcagaacctctgg 


6915 


5934 


2 


4 


SEQ ID NO: 


2961 


ctgaaaaagttagtgaaag 


1949 


1968 SEQ 


ID NO: 


4299 ctttctcgggaatattcag 


1G631 


10650 


2 


4 


SEQ ID NO: 


2962 


tttttcccagacagtgtca 


2246 2265 SEQ 


ID NO: 


4300tgacaggcattttgaaaaa 


9730 


9749 


2 


4 


SEQ ID NO:


2963 


ttttcocagacagtgtcaa 


2247 


2266 SEQ 


ID NO: 


4301 ttgaoaggcattttgaaaa 


9729 


9748 


2 


4 
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SEQ ID NO: 


2964


cattoagaacaagaaaatt 


3403 3422SEQ 


ID NO: 


4302aattocaattttgagaatg 


10414 


10433 


2 


4 


SEQ ID NO: 


2965 


tgaagagaagaftgaattt 


3628 3647SEQ 


ID NO: 


4303eeatgtcagctcttgttca 


109D2 


10921 


2 


4 


SEQ ID NO: 


2966 


tttgaatggaacacaggca 


3644 


3663SEQ 


ID NO: 


4304tgccagtttgaaaaacaaa 


11815 


11834 


2 


4 


SEQ ID NO:


2967 


ttctagattcgaatatcaa 


4407 4426SEQ 


ID NO: 


4306ttgacatgttgataaagaa 


7377 


7396 


2 


4 


SEQ ID NO: 


2968


gattcgaatetcaaattcQ 


4412 


4431 SEQ


ID NO: 


4306tgaaotagaccaacaaatc 


7162 


7181 


2 


4 


SEQ ID NO: 


2969 


tgcaacgaccaacttgaag


5083 


5102SEQ 


ID NO: 


4307cttcaggttccatcgtgca 


11384 


11403 


2 


4 


SEQ ID NO: 


2970 


ttaagctctcaaatgacat 


5325 5344 SEQ


ID NO: 


4308atgttgataaagaaattaa 


7382 


7401 


2 


4 


SEQ ID NO: 


2971 


caatttaacaacaatgaat 


6074 6093 SEQ


ID NO: 


4309attcaaactgcctatattg 


13876 


13895 


2 


4 


SEQ ID NO: 


2972 




6088 


6107SEQ 


ID NO: 


«fo lucQagagcacacggicnca 


TUbo7 


107UD 




4 


SEQ ID NO: 


2973 


<*a tr»a at ottn stiTA siift 


6421 


6440 SEQ 


ID NO: 


'rO 1 1 anauCCCl^aayuySig 


1 l*toD 


nouo 




A 


SEQ ID NO:


2974 


QU wCI IM IwvlClvlva^il 


7059 


7078 SEQ 


ID NO: 


HO liiadgidogiyciayyiicaci 


oOO 1 


□Ann 


n 


A 






tgaaqqaqactattcaaaa 


7227 


7246 SEQ 


ID NO:



431 SttctacacaQaaatattca 


13446 


13465 


2 


A 


SEQ ID NO:


2976


ttcaggctcttcagaaagc 


7929 


7948 SEQ


ID NO:


43 1 4 acttgctaacctctcto aa 


12312 


12331 


2 


4 


SEQ ID NO:


2977


tccacaaattgaacatooc 


8787 8806SEQ 


ID NO:


431 5ggaacctaocaaaacitaaa 


12533 


12552 


2 


4 



SEQ ID NO:


2978 


tgaataocaatgctgaact 


10167 10186SEQ 


ID NO:


431 6aattcaatgaatttattca 


13191 


13210 


2 


4 


SEQ ID NO:


2979 


taaactaatagatgtaatc 


12898 1291 7 SEQ 


ID NO:


431 7gattactatgaaaaattta 


13640 


13659 


2 


4 


SEQ ID NO:


2980 


ttgacdgtccattcaaaa 


13680 13690 SEQ 


ID NO:


4318 ttttaaaagaaatcttcaa 


13813 


13832 


2 


4 


SEQ ID NO: 


2981 


gggctgagtgcccttctcg 


19 


SEQ 


ID NO: 


431 9 cgaggccaggccacaaccc 


84 


103 


1 

1 


4 


SEQ ID NO: 


2982


ggctgagtgcccttctcgg




SEQ


ID NO: 


4320ccgaggccaggccgcagcc 


83 


102 


1 


A 


SEQ ID NO: 


2983 


ctgagtgcccttctcggtt 


22 


41 SEQ 


ID NO: 


4321 aaccotgcctgaatctcag 


11657 


11676 


1 


4 


SEQ ID NO: 


2984 


tctcggttgctgccgctga 


33 


52 SEQ 


ID NO: 


4322tcagctgacctcatcgaga 


2168 


2187 


1 


4 


SEQ ID NO: 


2985 


caggccgcagcccaggagc 


90 


109 SEQ


ID NO: 


4323gctctgcagcttcatcctg 


376 


395 


1 


4 


SEQ ID NO: 


2986 


gctggcgctgcctgcgctg 


151 


170SEQ 


ID NO: 


4324 cagcacagaccatttcagc 


4252 


4271 


1 


4 


SEQ ID NO: 


2987 


tgctgctggcgggcgccag 


177 


196SEQ 


ID NO: 


4325 ctggatgtaaccaccagca 


11186 


11205 


1 


4 


SEQ ID NO: 


2988 


ctggtctgtccaaaagatg 


227 


246 SEQ 


ID NO: 


4326catcctgaagaocagccag 


388 


407 


1 


4 


SEQ ID NO: 


2989 


ctgagagttccagtggagt 


291 


310SEQ 


ID NO: 


4327actcaccctggacattcag 


3391 


3410 


1 


4 


SEQ ID NO: 


2990 


tccagtggagtccctggga 


299 


31 8 SEQ 


ID NO: 


4328tcccggagccaaggctgga 


2683 


2702 


1 


4 


SEQ ID NO: 


2991 


aggugagctggaggttcc 


354 


373 SEQ 


ID NO: 


4329ggaaccctctocctcacct 


4736 


4755 


1 


4 


SEQ ID NO: 


2992 


tgagctggaggttccccag 


358 


377SEQ 


ID NO: 


4330 ctgggaggcatgatgctca 


9171 


9190 


1 


4 


SEQ ID NO: 


2993 


tctgcagcttcatcctgaa 


378 


397 SEQ 


ID NO: 


4331 ttcaaatataatcggcaga 


3269 


3288 


1 


4 


SEQ ID NO: 


2994 


gccagtgcaccctgaaaga 


402 


421 SEQ 


ID NO: 


4332tcttccgttctgtaatggc 


5802 


5821 


1 


4 


SEQ ID NO: 


2995 


ctctgaggagtttgctgca 


472 


491 SEQ 


ID NO: 


4333tgcaagaatattttgagag 


6346 


6367 


1 


4 


SEQ ID NO: 


2996 


aggtalgagctcaagctgg 


600 


519SEQ 


ID NO: 


4334ccagtt(ocggggeaacct 


12724 


12743 


1 


4 


SEQ ID NO: 


2997 


tcctttacccggagaaaga 


543 


562 SEQ 


ID NO: 


4335 tctttttgggaagcaagga 


2227 


2246 


1 


4 


SEQ ID NO: 


2998 


catcaagaggggcatcatt 


583 


602SEQ 


ID NO: 


4336 aatggtcaagttcctgatg 


2285 


2304 


1 


4 


SEQ ID NO: 


2999 


tcctggttcccccagagac 


609 


628 SEQ 


ID NO: 


4337 gtctctgaactcagaagga 


13996 


14015 


1 


4 


SEQ ID NO: 


3000 


aagaagccaagcaagtgtt 


630 


649SEQ 


ID NO: 


4338 aacaaataaatggagtctt 


14080 ' 


14099 


1 


4 


SEQ ID NO: 


3001 


aagcaagtgttgtttctgg 


638 


857SEQ 


ID NO: 


4339ccagagccaggfcgagctt 


11060 


11069 


1 


4 


SEQ ID NO: 


3002 


tdggataocgtgtatgga 


652 


671 SEQ 


ID NO: 


4340tccatgtcGcatttacaga 


11364 


11383 


1 


4 


SEQ ID NO: 


3003 


ccactcactttaccgtcaa 


678 


607SEQ 


ID NO: 


4341 ttgattttaacaaaagtgg 


6825 


6644 


1 


4 


SEQ ID NO: 


3004 


aggaagggcaaigtggcaa 


701 


720SEQ 


ID NO: 


4342ttgcaagcaagtctUccl 


3013 


3032 


1 


4 


SEQ ID NO: 


3005 


gcaatgtggcaacagaaat 


708 


727SEQ 


ID NO: 


4343atttccataccccgtttgc 


3488 


3507 


1 


4 


SEQ ID NO: 


3006 


caatgtggcaacagaaata 


709 


728SEQ 


ID NO: 


4344tattcttctmccaattg 


13834 


13853 


1 


4 


SEQ ID NO: 


3007 


tggcaacagaaatatccac 


714 


733SEQ 


ID NO: 


4345gtggcttcccatat<gcca 


1895 


1914 


1 


4 


SEQ ID NO: 


3008 


agagaodgggccagtgtg 


737 


756SEQ 


ID NO: 


4346cacattacatttggtctct 


2938 


2967 


1 


4 


SEQ ID NO: 


3009 


tgtgatogdtcaagccca 


762 


771 SEQ 


ID NO: 


4349tgggaaagccgccctcaca 


5218 


5237 


1 


4 


SEQ ID NO: 


3010 


gtgatcgcttcaagcccat 


753 


772SEQ 


ID NO: 


4350atgggaaagccgocQtcao 


5217 


5236 


1 


4 


SEQ ID NO: 


3011 


cagcccacttgctctcatc 


764 


803 SEQ 


ID NO: 


4351 gatgctgaacagtgagctg 


8162 


8171 


1 


4 


SEQ ID NO: 


3012 


gctctcatcaaaggcatga 


704 


813SEQ 


ID NO: 


4352tcataacagtactgtgagc 


10345 


10364 


1 


4 
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SEQ ID NO: 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



3013 


ccttatcaactctaatcaa 


3014 




3015 




3016 




3017 




3018 


nHfrrf nfftor*afl naataa 


3019 




3020 


atcaacaacccicttctttci 


3021 


flcaaccdcttctttQCltCIQ 


3022 


9 aaataaacctCQCBtttQ 


3023 


iattftciaaaactctccsQ 


3024 


ttaaaoactctccaooaac 


3025


aactaaaaaaactaaccat


3026


ctaaaaa aactaaccatct 


3027 


aa a actaaccatctctcia ci 




taaacaaaatatccdflacia 


3029 


caaiaaactaottactoaQ 


3030 


tactoaactaaaaacicclc 


3031 


acctcaatciatoaaacacit 


3032 


aatcacatctctcttticca 


3033 


A tctntctta ccaca G ct a 


3034 


tctctcttaccacaa eta a 


3036 


fnrr:acaantaaitQaGat 


3036 


aceaeaactaattaaaato 


3037 


tcacftlacaaaccHoat 


3038 


Gccttctn ataa atntaat 


3039 


atcaoctacctaataacGC 


3040 


ccttatatacactaaa cca 


3041 


ai^caaaccctacaaGGacG 


3042 


tactaattacctaataoaa 


3043 


taactacactaaaaataaa 


3044 


actacactaaaoatciaafla 


3045


ata aaa attacacctattt 


3046 


accatG □ aa ca □ tfaactc 


3047 


acaattaactccaaaactc 


3046 


caaaactcaaotcttcaat 



3049 


ranocictacaaaaaatoa 



3050 


ccaggaggttcttcttcag 


3051 




3052 


tttccttgatgatgcttct 


3053 


ggagataagcgactggctg 


3054 


gctgcctatcttatgttga 


3055 


actttgtggcttcccatat 


3056 


gccaatatcttgaactcag 


3057 


aatatcttgaactcagaag 


3058 


ctcagaagaattggatatc 


3059 


aagaattggatatccaaga 


3060 


agaattggatatccaagat 


3061
3062


tggatatccaagatctgaa 
atatccaagatctgaaaaa 



819 838 SEQ ID NO:

820 839 SEQ ID NO:

892 911 SEQ ID NO:

893 912 SEQ ID NO:
916 935 SEQ ID NO:
924 943 SEQ ID NO:

997 1016 SEQ ID NO:

998 1017 SEQ ID NO:
1002 1021 SEQ ID NO:
1031 1050 SEQ ID NO:
1090 1109 SEQ ID NO:
1094 1113 SEQ ID NO:
1110 1120 SEQ ID NO:
1112 1131 SEQ ID NO:
1117 1136 SEQ ID NO:
1132 1161 SEQ ID NO:
1162 1181 SEQ ID NO:
1174 1193 SEQ ID NO:
1188 1207 SEQ ID NO:
1204 1223 SEQ ID NO:

1210 1229 SEQ ID NO:

1211 1230 SEQ ID NO:

1218 1237 SEQ ID NO:

1219 1238 SEQ ID NO:
1248 1267 SEQ ID NO:
1332 1651 SEQ ID NO:
1349 1368 SEQ ID NO:
1440 1459 SEQ ID NO:
1480 1499 SEQ ID NO:
1516 1535 SEQ ID NO:
1546 1565 SEQ ID NO:
1648 1567 SEQ ID NO:
1560 1579 SEQ ID NO:
1610 1629 SEQ ID NO:
1618 1637 SEQ ID NO:
1629 1648 SEQ ID NO:
1703 1722 SEQ ID NO:
1738 1757 SEQ ID NO:
1744 1763 SEQ ID NO:
1759 1778 SEQ ID NO:
1781 1800 SEQ ID NO:
1798 1815 SEQ ID NO:
1890 1909 SEQ ID NO:
1910 1929 SEQ ID NO:
1913 1932 SEQ ID NO:
1924 1943 SEQ ID NO:

1929 1948 SEQ ID NO:

1930 1949 SEQ ID NO:
1935 1954 SEQ ID NO:
1938 1957 SEQ ID NO:



4353ctgagtgggtttatcaagg 

4354 gctgagtgggtttatcaag 

4355ltgcaatgagctcatggct 

4356gttgcaatgagctcatggc 

4357gtaggaataaatggagaag 

4358ttattgctgaatccaaaag 

4359aaagccatcaclgatgatc 

4360caaagGcatcactgatgat 

4361 tcacaaatcctttggctgt 

4362caaaatagaagggaatctt 

4363ctggtaactactttaaaca 

4364gttcaGtgaatttettcaa 

4365atggcattttttgcaagtt 

4366agattgatgggcagtlcag 

4367ctcaaagaatgacttttft 

4366tctccagataaaaaaGtca 

4369 ctcagatcaaagttaattg 

4370gagggtagtcataacagta 

4371 actgttgactcaggaaggc 

4372tggccacatagcatggact 

4373cagctgacctcatcgagat 

4374tcagctgacctcatcgaga 

4375acctgcaccaaagctggca 

4376 caccaaaaaccccaatggc 

4377accagatgctgaacagtga 

4378accacHacagctagBggg 

4379gggcgacctaagttgtgac 

4380tggctggtaacctaaaag9 

4381 ggtcctttatgattatgtc 

4382ttcccaaaagcagtcagca 

4383ttcaggtccatgcaag1ca 

4384tcttgaacacaaagtcagt 

43B5aaatgaaagtaaagatcat 

4386gagtaaaccaaaacttggt 

4387gagttactgaaaaagctgc 

4386 attggatatccaagatctg 

4389ccatgacctccagctcctg 

4390ctgaaatacaatgctctgg 

4391 gaaaaacttggaaacaaoc 

4392agaatccagatacaagaaa 

4393 cagcatgcctsgtttctcc 
4394tcaatatcaaaagoccagc 

4395 atatctggaaccttgaagt 

4396 ctgaactcagaaggatggc 
4397cttccattctoaatatatt 
4398gataaaagattacttt9ag 
43g9tcttcaatttattcttctt 
4400atcttcaatttattcttct 
4401 ttcacataccagaattcca 
4402tttttaaccagtcagatat 



12463 


12472 1 


4 




12471 1 


4 




3832 1 


4 




3831 1 


4 




9480 i 


4 




13675 1 


4 




168B 1 


4 




1687 1 


4 




9694 1 


4 


2077 


2086 1 


4 




5514 1 


4 


131 02 


13211 1 


4 




14033 1 


4 


4572 


4501 1 


4 


2578 


2597 1 


4 


1220& 


12226 1 


4 


12273 


12292 1 


4 


10337 


1 0356 1 


1 4 


12*>fi0 


12599 1 


1 4 


PflRfi 

OOwO 


8885 1 


1 4 

1 ~ 


91A0 

£ lUf7 


2188 1 


1 4 


2ifi8 


2187 ' 


1 4 


13QR3 


13982 


1 4 




112fi7 


1 4 


R148 


8167 


1 4 




10843 


1 4 


343Q 


3458 


1 4 




6605 


1 4 




12374 


1 4 


9938 


9957 

w w w f 


1 4 


10917 


10936 

1 W vww 


1 4 




8028 


1 4 


R11R 
Olio 


8137 


1 4 


0024 


9043 


1 4 


13727 


13746 


1 4 


1033 


1052 


1 4 


2dR'Ti 


2504 


1 4 






1 4 




4458 


1 4 






1 A 


9952 


9971 


1 4 


12045 


12064 


1 4 


10737 


10756 


1 4 


14000 


14019 


1 4 


13378 


13397 


1 4 


7273 


7292 


1 4 


13825 


1 3844 


1 4 


13824 


13843 


1 4 


6325 


8344 


1 4 


10186 


10204 


1 4 
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SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



3063 tatccaagatctgaaaaag 

3064 caagatctgaaaaagttag 

3065 aagatctgaaaaagttagt 
3066 tgaaaaagttagtgaaaga

3067 tccaactgtcatggacttc 

3068 tcagaaaattctctcggaa 
3069 ttccatcacttgacccagc

3070 cccagcctcagccaaaata 

3071 agcctcagccaaaatagaa 

3072 atctiatatttgatccaaa 

3073 tcttatatttgatocaaat 

3074 cttcctaaagaaagcatgc 
3075 ctaaagaaagcatgctgaa

3076 taaagaaagcatgctgaaa 

3077 gagattggcttggaaggaa 

3078 ctttgagccaacattggaa 

3079 cagacagtgtcaacaaagc 

3080 cagtgtcaacaaagctttg 

3081 agtgtcaacaaagctttgt 

3082 ctgatggtgtctctaaggt

3083 tgatggtgtctctaaggtc

3084 aaacatgagcaggatatgg 

3085 gaagctgattaaagatttg 

3086 aaagatttgaaatccaaag 

3087 gatgggtgcccgcactctg 

3088 gggatcccccagatgatig 

3089 ttttcttcactacatcttc 

3090 tcttcactacaicttcatg 
3091 tacatcttcatggagaatg

3092 ttcatggagaatgcctttg 

3093 tcatggagaatgcctttga 

3094 tttgaactccccactggag 

3095 ttgaactccccactggagc 

3096 tgaactccccactggagct

3097 cactggagctggattacag 

3098 actggagctggattacagt 

3099 agttgcaaatatcttcatc 

3100 gttgcaaatatcttcatct
3101 aaatatcttcatctggagt

3102 taaaactggaagtagccaa

3103 ggctgaactggtggcaaaa

3104 tgtggagtttgtgacaaat

3105 ttgtgacaaatatgggcat
3106 atgaacaccaacttcttcc

3107 cttccacgagtcgggtctg

3108 gagtcgggtctggaggctc

3109 cctaaaagctgggaagctg

3110 agctgggaagctgaagttt

3111 ccagattagagctggaact 

3112 ggataccctgaagtttgts 



1939 1958 SEQ ID NO:

1943 1982 SEQ ID NO:

1944 1963 SEQ ID NO:
1950 1969 SEQ ID NO:
1990 2009 SEQ ID NO:
2007 2026 SEQ ID NO:
2052 2071 SEQ ID NO:
2065 2084 SEQ ID NO:
2068 2087 SEQ ID NO:

2091 2110 SEQ ID NO:

2092 2111 SEQ ID NO:
2117 2136 SEQ ID NO:

2121 2140 SEQ ID NO:

2122 2141 SEQ ID NO:
2183 2202 SEQ ID NO:
2206 2225 SEQ ID NO:
2253 2272 SEQ ID NO:

2257 2276 SEQ ID NO:

2258 2277 SEQ ID NO:

2298 2317 SEQ ID NO:

2299 2318 SEQ ID NO:
2351 2370 SEQ ID NO:
2395 2414 SEQ ID NO:
2405 2424 SEQ ID NO:
2618 2537 SEQ ID NO:
2640 2559 SEQ ID NO:
2593 2612 SEQ ID NO:
2696 2615 SEQ ID NO:
2603 2622 SEQ ID NO:

2609 2628 SEQ ID NO:

2610 2829 SEQ ID NO:

2624 2643 SEQ ID NO:

2625 2644 SEQ ID NO:

2626 2645 SEQ ID NO:

2635 2654 SEQ ID NO:

2636 2666 SEQ ID NO:
2662 2671 SEQ ID NO:
2653 2672 SEQ ID NO:
2658 2877 SEQ ID NO:
2703 2722 SEQ ID NO:
2726 2747 SEQ ID NO:
2768 2777 SEQ ID NO:
2766 2785 SEQ ID NO:
2819 2838 SEQ ID NO:
2833 2852 SEQ ID NO:
2840 2869 SEQ ID NO:
2866 2885 SEQ ID NO:
2872 2891 SEQ ID NO:
3114 3133 SEQ ID NO:
3208 3227 SEQ ID NO:



4403ctttttaaccagtcagata 
4404ctaaatt]cccatggtcttg 
4405actaaattoccatggtctt 
44C5tcmctcgggaatattca 
4407gaagcacatatgaBctgga 
4408ttcctttaacaattcctga 
4409 gctgacatagggaatggaa 
441 Otattctatocaagattggg 

441 1 ttctatccaagattgggct 

44 1 2 tttgaaaaacaaagcagat 
4413attttttgcaa9ttaaaga 
4414ocatggcattatgatgaag 
4415ftcagggtgtggagtttag 
4416tttcttaaacattccttta 
441 71tccctocattaagttctc 
441 8ttccaatgaccaagaaaag 
441 9gcttactggacgaactctg 
4420 caaattcctggatacactg 
4421 acaagaatacgtctacad 
4422acctcggaaGaatcctcag 
4423 gacctgcgcaacgagotca 
4424 ccatg atctacatttgttt 
4425caaaaacattttcaacttc 
4426ctttaagttcagcatcttt 
4427ca9atttgaggattccatc 
4428caatcacaagtcgattccc 
4429 gaagtgtcagtggcaaaaa 
4430 catggcattatgatgaaga 
4431 cattatggaggcccatgta 
4432caaaatcaactttaatgaa 
4433tcaacacaatcttcaatga 
4434ctccccaggacctttcaaa 
4435gctccccagg acctttcaa 
4436 agctccccaggacctttca 
4437ctgtttctgagtcccagtg 
4438actgtttctgagtcccagt 
4439gatgatgccaaaatcaact 
4440agatgatgccaaaatcaac 
4441 actcagaaggatggcattt 
4442ttggttacaggaggcttta 
4443ttttcttttcagcocagcc 
4444 attttcaagcaaatgcaca 
4445atgcgtctaccttacacaa 
4446g9aagctgaagtttatcat 
4447cagagctatcactgggaag 
4448 gagcttactggacgaactc 
4449cagcctccccagccgtagg 

4450 aaactgttaatttacagct 

4451 agtttooggggaaacctgg 
4452tacagtattctgaaaatcc 



10184 


10203 1 


4 


4973 


4992 1 


4 


4972 


4991 1 


4 


10630 


10649 1 


4 


13945 


13964 1 


4 


9501 


9520 1 


4 


8441 


8460 1 


4 


7820 


7839 1 


4 


7822 


7841 1 


4 


11821 


11840 1 


4 


14019 


14038 1 


1 4 


3614 


3633 1 


4 


5694 


6713 1 


4 


9490 


9509 1 


1 4 


11709 


11728 1 


1 4 


11068 


11087 1 


1 4 


6142 


6151 1 


1 4 


9857 


9876 - 


1 4 


4359 


4378 * 


1 4 


3333 


3352 ' 


1 4 


8831 


8860 ' 


1 4 


6798 


8815 ' 


1 4 


5287 


5306 ' 


1 4 


7614 


7633 


1 4 


7983 


8002 ' 


1 4 


9083 


9102 ' 


1 4 


10382 


10401 ' 


1 4 


3615 


3634 


t 4 


9445 


9464 ' 


1 4 


6607 


6626 


1 4 


13116 


13135 ' 


1 4 


9842 


9861 


1 4 


9841 


9860 ' 


1 4 


9840 


9859 


1 4 


9344 


9363 ' 


1 4 


9343 


9362 


1 4 


6599 


6618 


1 4 


6598 


6617 ' 


1 4 


14004 


14023 


1 4 


7600 


7619 


1 4 


9228 


9247 


1 4 


8538 


8557 


1 4 


9521 


9540 


1 4 


2877 


2896 


1 4 


5235 


5254 


1 4 


6140 


6159 


1 4 


12120 


12139 


1 4 


5463 


5482 


1 4 


12726 


12746 


1 4 


8393 


8412 


1 4 
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SEQ ID NO: 311 3 ctgaggctaccatgacatt 
SEQ ID NO: 3114 tgtccagtgaagtccaaat
SEQ ID NO: 3115 aattccggattttgatgtt
SEQ ID NO: 3116 ttccggattttgatgttga
SEQ ID NO: 3117 cggaacaatcctcagagtt
SEQ ID NO: 3118 tcdcagaottaatgatga
SEQ ID NO: 3119 ctcaccctggacattcaga
SEQ ID NO: 3120 cattcagaacaagaaaatt 

SEQ ID NO: 3121 actgaggtcgccctcatgg 

SEQ ID NO: 3122 ttatttccataccccgttt 

SEQ ID NO: 3123 gtttgcaagcagaagccag 

SEQ ID NO: 3124 tttgcaagcagaagccaga 

SEQ ID NO: 3125 ttgcaagcagaagccagaa 

SEQ ID NO: 3126 ctgcttctccaaatggact 

SEQ ID NO: 3127 tgctacagcttatggctcc 

SEQ ID NO: 3128 acagcttatggctccacag

SEQ ID NO: 3129 tttccaagagggtggcatg 

SEQ ID NO: 3130 ccaagagggtggcatggca 

SEQ ID NO: 3131 gtggcatggcatlatgatg 

SEQ ID NO: 3132 tgatgaagagaagattgaa 

SEQ ID NO: 3133 gaagagaagattgaatttg 

SEQ ID NO: 3134 sagaagattgaatttgaat 

SEQ ID NO: 3135 tttgaatggaacacaggca 

SEQ ID NO: 3136 aggcaccaatgtagatacc 

SEQ ID NO: 3137 caaaaaaatgacttccaat 

SEQ ID NO: 3138 aaaaaaatgacttccaatt 

SEQ ID NO: 3139 aaaaaatgacttocaattt 

SEQ ID NO: 3140 cagagtocctcaaacagac 

SEQ ID NO: 3141 aaattaatagttgcaatga 

SEQ ID NO: 3142 ttcaacctccagaacatgg 

SEQ ID NO: 3143 tgggattgccagacttcca

SEQ ID NO: 3144 cagtttgaaaattgagatt 

SEQ ID NO: 3145 gaaaattgagattcctttg 

SEQ ID NO: 3146 tttgccttttggtggcaaa

SEQ ID NO: 3147 ctccagagatctaaagalg 

SEQ ID NO: 3148 tctaaagatgttagagact 

SEQ ID NO: 3149 ctgtgggattccatctgcc 

SEQ ID NO: 3150 atctgccatctcgagagtt 

SEQ ID NO: 3151 tclcgagagttccaagtcc 

SEQ ID NO: 3152 agtccctacttttaccatt 

SEQ ID NO: 3153 acttttaccattcccaagt

SEQ ID NO: 3154 cattcccaagttgtatcaa

SEQ ID NO: 3155 accacatgaaggctgactc

SEQ ID NO: 3156 tttcctacaatgtgcaagg 

SEQ ID NO: 3157 ctggagaaacaacatatga 

SEQ ID NO: 3158 atcatgtgatgggtctcta

SEQ ID NO: 3159 catgtgatgggtctctacg

SEQ ID NO: 3160 ttctagattcgaatatcaa 

SEQ ID NO: 3161 tggggaccacagatgtctg 

SEQ ID NO: 3162 ctaacactggccggctcaa 



3252 3271 SEQ ID NO: 


4453aatga8Ctcatggcttcag 


3817 


3836 


1 4 


3297 3316SEQIDN0: 


4454atttt9agaggaatcgaca 


6357 


6376 


1 4 


3313 3332SEQ ID NO: 


4455aacacatgaatcacaaaU 


8938 


8957 


1 4 


3315 3334SEQIDN0: 


4456tcaaaacgagcttcaggaa 


13207 


13226 


1 4 


3337 3356 SEQ ID NO:


4457aacttgtacaactggtccg 


4211 


4230 


1 4 


3345 3364SEQIDNO: 


4458tcatcaettggttacagga 


7593 


7612 


1 4 


3392 3411SEQIDNO: 


4459tclgcagaacaatgctgag 


12439 


12458 


1 4 


3403 3422SEQIDNO: 


4460aattgactttgtagaaatg 


8104 


8123 


1 4 


3422 3441SEQ1DNO: 


4461 ccatgcaagtcagcccagf 


10924 


10943 


1 4 


3486 3505SEQIDNO: 


4462aaactgcctatattgataa 


13880 


13899 


1 4 


3501 3520SEQ ID NO: 


4463ctggacttctcttcaaaac 


5408 


5427 


1 4 


3502 3521 SEQ ID NO:


4464tctgggtgtcgacagcaaa 


5272 


5291 


1 4 


3503 3522SEQIDNO: 


4465ttctggglgtcgacagcaa 


5271 


6290 


1 4 


3554 3573SEQIDNO: 


4466agtcaagattgatgggcag 


4567 


4586 


1 4 


3577 3596SEQIDNO: 


4467ggaggctttaagttcagca 


7609 


7628 


1 4 


3581 3600SEQIDNO: 


4488ctgtatagcaaattcclgt 


5897 


5916 


1 4 


3600 3619SEQIDNO: 


4469catggacttcttctggaaa 


8877 


6896 


1 4 


3603 3622SEQ1DN0: 


4470tgcccagcaagcaagttgg 


9361 


9380 


1 4 


3611 3630SEQIDNO: 


4471 catccttaacaccttccac 


8071 


8090 


1 4 


3626 3644 SEQ ID NO:


4472ttcactgttcctgaaatca 


7871 


7890 


1 4 


3629 3648SEQIDNO: 


4473caaaaacattttcaacnc 


6287 


6308 


1 4 


3632 3651 SEQ ID NO:


4474attoataatoccaactctc 


8278 


8297 


1 4 


3644 3663SEQIDNO: 


4475lgcctltgtgtacaccaaa 


11236 


11255 


1 4 


3658 3677sEQIDNO: 


4476ggtaacctaaaaggagcct 


5591 


5610 ' 


1 4 


3676 3695sEQIDNO: 


4477attgaagtacctacttttg . 


8366 


8385 ' 


1 4 


3677 3696SEQIDNO: 


4478 aattgaagtacctactttt 


8365 


8384 ' 


1 4 


3678 3697SEQIDNO: 


4479 aaatccaatclcctctm 


8406 


8425 ' 


1 4 


3760 3779SEQIDN0: 


4480 gtctgtgggattccatctg 


4090 


4109 ' 


1 4 


3803 3822SEQIDNO: 


4481 tcataagttcaatgaattt 


13186 


13205 1 


1 4 


3899 3918SEQIDNO: 


4482ccattgaccagafgctgaa 


8142 


8161 1 


4 


3915 3934SEQIDNO: 


4483tggaaatgggcctgcccca 


8903 


8922 1 


4 


3994 4013SEQIDNO: 


4484aatcacaactcctccact9 


9541 


9560 1 


4 


4000 4019SEQIDNO: 


4485 caaaactaccacacatttc 


13694 


13713 1 


4 


4015 4034SEQIDNO: 


44861ttgagaggaatcgacaaa 


6359 


6378 1 


4 


4036 4055SEQIDNO: 


4487 catcaattggttacaggag 


7694 


7613 1 


4 


4045 4064SEQIDNO: 


4488 agtcx^ttcatgtccctaga 


10033 


10052 1 


4 


4092 41 11 SEQ ID NO: 


4489ggcattttgaaaaaaacag 


9735 


9754 1 


4 


4104 4123SEQIDNO: 


4490aactctcaaacoctaagat 


8556 


8576 1 


4 


4112 4131SEQIDNO: 


4491 ggacattoctctagcgaga 


8215 


8234 1 


4 


4126 4145SEQIDNO: 


4492aatgaatacagccaggaGt 


6086 


6105 1 


4 


4133 4162SEQIDNO: 


4493actttgtagaaatgaaagt 


8109 


8128 1 


4 


4141 4160SEQIDNO: 


4494ttgaaggacttcaggaatg 


12009 


12028 1 


4 


4284 4303SEQIDNO: 


4495gagtaaaccaaaacttggt 


0024 


9043 1 


4 


4317 4336SEQ ID NO: 


4496 cctttaacaattcctgaaa 


9503 


9522 1 


4 


4338 4357SEQIDNO: 


4497tcattctgggtctttccag 


11035 


11054 1 


4 


4378 4397sEaiDNO: 


4498tagaattacagaaaatgat 


6565 


6584 1 


4 


4380 4309SEQIDNO: 


4499cgtaggcaccgtgggcatg 


12133 


12152 1 


4 


4407 4426SEQIDNO: 


4500ttgatgatgctgtcaagaa 


7308 


7327 1 


4 


4499 4518SEQIDNO: 


4501 cagaattccagcttcccca 


8334 


8353 1 


4 


4644 4663SEQIDNO: 


4502ttgaggctattgatgttag 


3984 


7003 1 


4 
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SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO:
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



3163 taacactggccggcteaat 

3164 aacactggccggctcaatg 

3165 clggccggctcaatggaga 

3166 agataacaggaagatatga 
3167 tccctcacctccacctctg

3168 agctgactttaaaatctga 

3169 ctgactttaaaatctgaca 
3170 caagatggatatgaccttc

3171 gctgcgttctgaatatcag

3172 cgttctgaatatcaggctg
3173 aattcccatggtcttgagt

3174 tggtcttgagttaaatgct 

3175 cttgagttaaatgctgaca 

3176 ttgagttaaatgctgacat

3177 tgagttaaatgctgacatc

3178 acttgaagtgtagtctcct

3179 agtgtagtctcctggtgct

3180 gtgctggagaatgagctga 

3181 ctggggcatctatgaaatt 

3182 alggccgcttcagggaaca 
3183 ttcagtctggatgggaaag
3184 ccatgattctgggtgtcga
3185 aaaacattttcaacttcaa

3186 cttaagctctcaaatgaca 

3187 ttaagctctcaaatgacat 

3188 catgatgggctcatatgct
3189 tgggctcatatgctgaaat
3190 actggacttctcttcaaaa

3191 acttctcttcaaaacttga 

3192 ctgacaagttttataagca 

3193 aagttttataagcaaactg 
3194 ctgttaatttacagctaca

3195 ttacagctacagccctatt 

3196 tctggtaactactttaaac 

3197 tttaaacagtgacctgaaa 

3198 ttaaacagtgacctgaaat 

3199 cagtgacctgaaatacaat

3200 tgtggctggtaaoctaaaa 

3201 ttatoagcaagctataaag 

3202 ggttcagggtgtggagttl 

3203 attcagactcactgcattt 

3204 ttcagactcactgcatttc 

3205 tacaaatggcaatgggaaa 

3206 gctgtatagcaaattcctg 

3207 tgagcagacaggcacdgg 

3208 ggcacctggaaactcaaga 

3209 tgaatacagccaggacttg 

3210 gaatacagccaggacttgg

3211 ctggacgaactctggctga

3212 ttttactcagtgagcccat 



4645 4684SEQIDNO: 


4503 attgaggctattgatgtta 


6983 


•ynno "1 
700^ 1 


A 


4646 4665SEQIDNO: 


4504cattgaggctattgatgtt 


6982 


7001 1 


•t 


4650 4669SEQIDNO: 


4506tctccatctgogctaccag 


12073 


1209a 1 


A 
H 


4713 4732SEQIDNO: 


4506tcatotccmcttcatct 


d AAA n 

10210 


102Z8 1 


A 
4 


4745 4764SEQIDNO: 


4507 cagatatatatctceggga 


04 OA 

o1 84 


oaOo 1 


A 
«• 


4818 4837SEQIDNO: 


4508tcaggctcttcagaaagct 


7930 


7949 1 


A 
•1 


4820 4839SEQIDNO: 


4509tgtcaagataaacaatcag 


o74C/ 


Q7CQ 1 


yf 


4873 4892SEQIDNO: 


461 Ogaagtagtactgcatcttg 




OoOa i 


A 
*? 


4909 492BSEQIDNO: 


451 1 ctgagtoccagtgcccagc 


9O0U 


o009 1 


A 


4913 4932SEQ1DNO: 


451 2cagcaagtacctgagaacg 


AM H 


OQOU 1 


A 
■f 


4976 4995SEQ ID NO: 


4513 actcagatcaaagnaatt 




1 1 1 


A 


4984 5003sEQlDNO: 


451 4agcacagtacgaaaaacca 




1 UO^O 1 


A 


4988 5007SEQIDNO: 


451 Stgtccctagaaatctcaag 




1 UUO 1 1 


A 


4989 5008SEQIDNO: 


451 6atgtccctagaaatctcaa 




lUUOU 1 


A 


4990 50Q9SEQIDNO: 


451 7gatggaaccc(ctccctca 


ATTQO 


4f OA I 


A 


6094 6113SEQ ID NO: 


451 Oaggaaactcagatcaaagt 




1 AADD 1 


A 


6100 5119SEQIDNO: 


451 9agcagccagtggcaccact 


•lOCt A 

i2o14 


idOOO 1 


A 


6114 5133SEQIDNO: 


4520 tcagccaggtttatagcac 


7704 


f too \ 


A 


5151 5170SEQIDNO: 


4521 aatttctgattaccaccag 






1 A 


6178 5107SEQIDNO: 


4522 tgttttttggaaatgccat 


0049 


OOOO 1 


1 A 

1 4 


5207 522esEQIDNO: 


A AM^^M^ AAA - — ^ -IXA . ■ 

4523 ctttgacaggcattttgaa 


B7^f 


9/ 40 ' 


1 A 


5266 5284SEQIDNO: 


4524 tcgatgcacatacaaatgg 


Oooo 


OaOr 


1 A 

1 4 


5289 5308SEQIDNO: 


A^M^tmAM.^ A AL A AMAA 

4525ttgatgttagagtgctttt 






i A 


5324 5343SEQIDNO: 


4526tgtcctacaacaagttaag 


7255 


?OTA * 

7a74 


1 A 

1 4 


5325 5344SEQIDNO: 


A PW ft A A - . 1 A. — ^_ 

4627 atgtcxjtacaacaagttaa 


7254 




1 A 


5341 5360SEQIDNO: 


^ ^ M. M.m - - _ M. - — t — 

4528 agcatctttggctcacatg 


76 Z4 


r04o 


1 A 

1 4 


5346 5365SEQIDNO: 


4529 atttetcaaaagaagccca 


1ao4Z 


lAt^O 1 


1 A 


5407 5426SEQIDNO: 


^ — *fc — * A—. -— -— —A 

4530ttttggcaagctatacagt 


oOOU 


0099 


1 A 


5412 5431 SEQ ID NO: 


4531 tcaattgggagagacaagt 


DOU4 


00 AO 


1 A 


5445 5464SEQ ID NO: 


A A A 1 A A AAA. A __ 

4532 tgctttgtgagtttatcag 


9693 


9# 1£ 


4 A 
1 4 


5450 5469SEQIDNO: 


4533cagtcatgtagaaaBactt 


AAOQ 


•t*T*rO 


1 A 


5466 5485SEQIDNO: 


4534tgtactggaaaacgtacag 


Oaoo 


D4U/ 


1 A 


5474 5493SEQIDNO: 


A ^ a. _ AA * - - — A 

4535aatatigatcaatttgtaa - 






1 *t 


5494 5513SEQIDNO: 


4536gtttgaaaaacaaagcaga 


1 IOaU 


1 I009 


1 A 


550B 3525SEQIDNO: 


A — MA Mm A fc* ^ ^ AAA — t — ^ 

4537tttcatt1gaaegaataaa 




r vOl 


1 A 


5507 5526SEQIDNO: 


4538atttcaagcaagaacttaa 


•1 n^4A 
10434 


10400 


1 A 
1 4 


5512 5531 SEQ ID NO: 


A A A ft ... AA. A - 

4539 attggcgtggagcttactg 


Biol 


b lOU 


■1 A 
1 4 


5584 5503SEQIDNO: 


4540ttttgctggagaagccaca 


10700 


lUf 04 


1 A 


5657 6876SEQIDNO: 


4541 ctttgcactatgttcataa 




lAr OO 


•1 A 
1 4 


5692 5711SEQIDNO: 


4542 aaacacctaagagtaaacc 




9UOO 


1 4 


5775 5794SEQ1DNO: 


4543aaatgctgacatagggaat 


8437 


8456 


1 4 


5776 57G5SEQIDNO: 


4544gaaatattatgaacttgaa 


13312 


13331 


1 4 


5848 5867SEQIDN0: 


4545tttcctaaagctggatgta 


11176 


11106 


1 4 


6896 6915SEQ1DNO: 


4546caggtccatgcaagtcagc 


10919 


10938 


1 4 


6043 60623EQIDNO: 


4547 ccagcttccccacatctca 


8341 


6360 


1 4 


6053 6072SEQIDNO: 


4548tcttcgtgttteaactgoc 


11221 


11240 


1 4 


6088 6107SEQIDNO: 


4549caag1aagtgctaggttca 


9380 


9399 


1 4 


6089 6108SEQIDNO: 


4550 ccaacacttacttgaattc 


10668 


10687 


1 4 


6147 6168SEQIDNO: 


4551 tcagaaagctaccttccag 


7939 


7958 


1 4 


6201 6220SEQIDNO: 


4552 atggacttcttctggaaaa 


8878 


8897 


1 4 
SEQ ID NO: 3213 gatgagagatgccgttgag 
SEQ ID NO: 3214 aattgttgcttttgtaaag 
SEQ ID NO: 3215 cttttgtaaagtatgataa 
SEQ ID NO: 3216 tttgtaaagtatgataaaa 
SEQ ID NO: 3217 tccattaacctcccatttt 
SEQ ID NO: 3218 ccattaacctcccattttt 
SEQ ID NO: 3219 cttgcaagaatattttgag 
SEQ ID NO: 3220 agaatattttgagaggaat 
SEQ ID NO: 3221 attatagttgtactggaaa 
SEQ ID NO: 3222 gaagcacatcaatattgat 
SEQ ID NO: 3223 acatcaatattgatcaatt 
SEQ ID NO: 3224 gaaaactcccacagcaagc 
SEQ ID NO: 3225 ctgaattcattcaattggg 
SEQ ID NO: 3226 tgaattcattcaattggga 
SEQ ID NO: 3227 aactgactgctctcacaaa 
SEQ ID NO: 3228 aaaagtatagaattacaga 
SEQ ID NO: 3229 atcaactttaatgaaaaac 
SEQ ID NO: 3230 tgatttgaaaatagctatt 
SEQ ID NO: 3231 atttgaaaatagctattgc 
SEQ ID NO: 3232 attgctaatattattgatg 
SEQ ID NO: 3233 gaaaaattaaaaagtcttg 
SEQ ID NO: 3234 actatcatatccgtgtaat 
SEQ ID NO: 3235 tattgattttaacaaaagt 
SEQ ID NO: 3236 ctgcagcagcttaagagac 
SEQ ID NO: 3237 aaaacaacacattgaggct 
SEQ ID NO: 3238 ttgagcatgtcaaacactt 
SEQ ID NO: 3239 tttgaagtagctgagaaaa 
SEQ ID NO: 3240 ttagtagagttggcccacc 
SEQ ID NO: 3241 tgaaggagactattcagaa 
SEQ ID NO: 3242 gagactattcagaagctaa 
SEQ ID NO: 3243 aattagttggatttattga 
SEQ ID NO: 3244 gcttaatgaattatctttt 
SEQ ID NO: 3245 ttaacaaattccttgacat 
SEQ ID NO: 3246 aaattaaagtcatttgatt 
SEQ ID NO: 3247 sactcaatggtgaaattca 
SEQ ID NO: 3248 gaaattcaggctctggaac 
SEQ ID NO: 3249 actaccacaaaaagctgaa 
SEQ ID NO: 3250 ccaaaataaccttaatcat 
SEQ ID NO: 3251 aaataaccttaatcatcaa 
SEQ ID NO: 3252 tttaagttcagcatctttg 
SEQ ID NO: 3253 caggtttatagcacacttg 
SEQ ID NO: 3254 gttcactgttcctgaaatc 
SEQ ID NO: 3255 cactgttcctgaaatcaag 
SEQ ID NO: 3256 actgttcctgaaatcaaga 
SEQ ID NO: 3257 gcctgcctttgaagtcagt 
SEQ ID NO: 3258 taacagatttgaggattcc 
SEQ ID NO: 3259 gttttccacaccagaattt 
SEQ ID NO: 3260 tcagaaccattgaccagat 
SEQ ID NO: 3261 tagcgagaatcaccctgcc 
SEQ ID NO: 3262 ccttaatgattttcaagtt 



6241 6260 SEQ ID NO: 
6277 6296 SEQ ID NO: 
6285 6304 SEQ ID NO: 
6287 6306 SEQ ID NO: 
6320 6339 SEQ ID NO: 
6321 6340 SEQ ID NO: 
6346 6365 SEQ ID NO: 
6352 6371 SEQ ID NO: 
6380 6399 SEQ ID NO: 
6415 6434 SEQ ID NO: 
6420 6439 SEQ ID NO: 
6465 6484 SEQ ID NO: 
6494 6513 SEQ ID NO: 
6495 6514 SEQ ID NO: 
6540 6569 SEQ ID NO: 
6558 6577 SEQ ID NO: 
6611 6630 SEQ ID NO: 
6694 6713 SEQ ID NO: 
6696 6715 SEQ ID NO: 
6710 6729 SEQ ID NO: 
6737 6756 SEQ ID NO: 
6762 6781 SEQ ID NO: 
6823 6842 SEQ ID NO: 
6914 6933 SEQ ID NO: 
6973 6992 SEQ ID NO: 
7059 7078 SEQ ID NO: 
7100 7119 SEQ ID NO: 
7199 7218 SEQ ID NO: 
7227 7246 SEQ ID NO: 
7232 7251 SEQ ID NO: 
7293 7312 SEQ ID NO: 
7327 7346 SEQ ID NO: 
7365 7384 SEQ ID NO: 
7394 7413 SEQ ID NO: 
7464 7483 SEQ ID NO: 
7475 7494 SEQ ID NO: 
7492 7511 SEQ ID NO: 
7578 7597 SEQ ID NO: 
7581 7600 SEQ ID NO: 
7615 7634 SEQ ID NO: 
7739 7758 SEQ ID NO: 
7870 7889 SEQ ID NO: 
7873 7892 SEQ ID NO: 
7874 7893 SEQ ID NO: 
7909 7928 SEQ ID NO: 
7980 7999 SEQ ID NO: 
8050 8069 SEQ ID NO: 
8136 8155 SEQ ID NO: 
8228 8245 SEQ ID NO: 
8299 8318 SEQ ID NO: 



4553 ctcatctcctttcttcatc 
4554 cttttctaaacttgaaatt 
4555 ttatgaacttgaagaaaag 
4556 ttttcacattagatgcaaa 
4557 aaaattgatgatatctgga 
4558 aaaagggtcatggaaatgg 
4559 ctcaattttgattttcaag 
4560 attccctccattaagttct 
4561 tttcaagcaagaacttaat 
4562 atcagttcagataaacttc 
4563 aattccctgaagttgatgt 
4564 gctttctcttccacatttc 
4565 cccatttacagatcttcag 
4566 tcccatttacagatcttca 
4567 tttgaggattccatcagtt 
4568 tctggctccctcaactttt 
4569 gtttattgaaaatattgat 
4570 aatattattgatgaaatca 
4571 gcaagaacttaatggaaat 
4572 catcacactgaataccaat 
4573 caagagcttatgggatttc 
4574 attactttgagaaattagt 
4575 acttgacttcagagaaata 
4576 gtcttcagtgaagctgcag 
4577 agcctcacctcttactttt 
4578 aagtagctgagaaaatcaa 
4579 ttttcacattagatgcaaa 
4580 ggtggactcttgctgctaa 
4581 ttctcaattttgattttca 
4582 ttagccacagctctgtctc 
4583 tcaagaagcttaatgaatt 
4584 aaaacgagcttcaggaagc 
4585 atgtcctacaacaagttaa 
4586 aatcctttgacaggcattt 
4587 tgaaattcaatcacaagtc 
4588 gttctcaattttgattttc 
4589 ttcaggaactattgctagt 
4590 atgatttccctgaccttgg 
4591 ttgaagtaaaagaaaattt 
4592 caaatctggatttcttaaa 
4593 caagggttcactgttcctg 
4594 gattctcagatgagggaac 
4595 cttgaacacaaagtcagtg 
4596 tcttgaacacaaagtcagt 
4597 actgttgactcaggaaggc 
4598 ggaagcttctcaagagtta 
4599 aaatttctctgctggaaac 
4600 atctgcagaacaatgctga 
4601 ggcagcttctggcttgcta 
4602 aactgittoactcaggaagg 



10209 


10228 


1 4 


9064 


9083 


1 4 


13318 


13337 


1 4 


6421 


8440 


1 4 


10727 


10746 


1 4 


8893 


8012 


1 4 


6528 


8547 


1 4 


11708 


11727 


1 4 


10435 


10454 


1 4 


7999 


8018 


1 4 


11487 


11506 


1 4 


■lit 


10079 


1 4 






11371 


11390 


1 4 


11370 


11389 


1 4 


7087 


8006 


1 4 


9050 


9069 


1 4 


6811 


6830 


1 4 


6718 


6735 


1 4 


10441 


10460 


1 4 


10159 


10178 


1 4 


11161 


11180 


1 4 


7281 


7300 


1 4 


11404 


11423 


1 4 


10699 


10718 


1 4 


10571 


10590 


1 4 


7104 


7123 


1 4 


8421 


6440 


i 4 


7776 


7795 1 


1 4 


8526 


8545 1 


1 4 


10301 


10320 1 


1 4 


7320 


7339 1 


4 

• 


13209 


13228 1 


4 


7264 


7273 ^ 


4 


9723 


9742 1 


4 


9076 


9095 1 

W V W V 1 




8525 


8544 1 


A 
f 


10645 


10664 1 


A 
*f 


10950 


10969 1 


A 
t 


10749 


1 Q7RR 1 
1 w r wo 1 


A 

*r 


9480 


9499 1 


4 


7865 


7884 1 


4 


8922 


8941 1 


4 


6006 


6027 1 


4 


8007 


6028 1 


4 


12580 


12599 1 


4 


13222 


13241 1 


4 


9418 


9437 1 


4 


12438 


12457 1 


4 


12301 


12320 1 


4 


12570 


12598 1 


4 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



3263 acataccagaattccagct 
3264 aatgctgacatagggaatg 
3265 atgctgacatagggaatgg 
3266 aaccacctcagcaaacgaa 
3267 agcaggtatcgcagcttcc 
3268 tgcacaactctcaaaccct 
3269 aggagtcagtgaagttctc 
3270 tttttggaaatgccattga 
3271 aatggagtgattgtcaaga 
3272 gtcaagataaacaatcagc 
3273 tccacaaattgaacatccc 
3274 ttgaacatccccaaactgg 
3275 acatccccaaactggactt 
3276 acttctctagtcaggctga 
3277 tgaatcacaaattagtttc 
3278 agaaggacccctcacttcc 
3279 ttggactgtccaataagat 
3280 actgtccaataagatcaat 
3281 ctgtccaataagatcaata 
3282 gtttatgaatctggctccc 
3283 atgaatctggctccctcaa 
3284 ctcaacttttctaaacttg 
3285 ctaaaggcatggcactgtt 
3286 aaggcatggcactgtttgg 
3287 atccacaaacaatgaaggg 
3288 ggaatttgaaagttcgttt 
3289 aataactatgcactgtttc 
3290 gaaacaacgagaacattat 
3291 ttcttgaaaacgacaaagc 
3292 ataagaaaaacaaacacag 
3293 aaaacaaacacaggcattc 
3294 gcattccatcacaaatcct 
3295 tttgaaaaaaacagaaaca 
3296 caatgcattagattttgtc 
3297 caaagctgaaaaatctcag 
3298 cctggatacactgttccag 
3299 gttgaagtgtctccattca 
3300 tttctccatcctaggttct 
3301 ttctccatcctaggttctg 
3302 tcattagagctgccagtcc 
3303 tgctgaactttttaaccag 
3304 ctcctttcttcatcttcat 
3305 tgtcattgatgcactgcag 
3306 tgatgcactgcagtacaaa 
3307 agctctgtctctgagcaac 
3308 agccgaaattccaattttg 
3309 ttgagaatgaatttcaagc 
3310 aaacctactgtctcttcct 
3311 tacttttccattgagtcat 
3312 tcaggtccatgcaagtcag 



8328 8347 SEQ ID NO: 
8438 8457 SEQ ID NO: 
8439 8458 SEQ ID NO: 
8458 8477 SEQ ID NO: 
8476 8495 SEQ ID NO: 
8551 8570 SEQ ID NO: 
8592 8611 SEQ ID NO: 
8652 8671 SEQ ID NO: 
8729 8748 SEQ ID NO: 
8741 8760 SEQ ID NO: 
8787 8806 SEQ ID NO: 
8795 8814 SEQ ID NO: 
8799 8818 SEQ ID NO: 
8814 8833 SEQ ID NO: 
8944 8963 SEQ ID NO: 
8968 8987 SEQ ID NO: 
8988 9007 SEQ ID NO: 
8992 9011 SEQ ID NO: 
8993 9012 SEQ ID NO: 
9041 9060 SEQ ID NO: 
9045 9064 SEQ ID NO: 
9059 9078 SEQ ID NO: 
9129 9148 SEQ ID NO: 
9132 9151 SEQ ID NO: 
9262 9281 SEQ ID NO: 
9279 9298 SEQ ID NO: 
9332 9351 SEQ ID NO: 
9432 9451 SEQ ID NO: 
9599 9618 SEQ ID NO: 
9648 9667 SEQ ID NO: 
9654 9673 SEQ ID NO: 
9667 9686 SEQ ID NO: 
9740 9759 SEQ ID NO: 
9757 9776 SEQ ID NO: 
9817 9836 SEQ ID NO: 
9863 9882 SEQ ID NO: 
9890 9909 SEQ ID NO: 
9964 9983 SEQ ID NO: 
9965 9984 SEQ ID NO: 
10019 10038 SEQ ID NO: 
10177 10196 SEQ ID NO: 
10214 10233 SEQ ID NO: 
10234 10253 SEQ ID NO: 
10240 10259 SEQ ID NO: 
10309 10328 SEQ ID NO: 
10408 10427 SEQ ID NO: 
10424 10443 SEQ ID NO: 
10469 10488 SEQ ID NO: 
10583 10602 SEQ ID NO: 
10918 10937 SEQ ID NO: 



4603 agctgccagtccttcatgt 
4604 cattaatcctgccatcatt 
4605 ccatttgagatcacggcat 
4606 ttcgttttccattaaggtt 
4607 ggaagtggccctgaatgct 
4608 agggaaagagaagattgca 
4609 gagaacttactatcatcct 
4610 tcaatgaatttattcaaaa 
4611 tcttttcagcccagccatt 
4612 gctgactttaaaatctgac 
4613 gggatttcctaaagctgga 
4614 ccagtttccagggactcaa 
4615 aagtcgaitcccagcatgt 
4616 tcagatggaaaaatgaagt 
4617 gaaagtccataatggttca 
4618 ggaagaagaggcagcttct 
4619 atctaaatgcagtagccaa 
4620 attgataaaaccatacagt 
4621 tattgataaaaccatacag 
4622 gggaatctgatgaggaaac 
4623 ttgagttgcccaccatcat 
4624 caagatcgcagactttgag 
4625 aacagaaacaatgcattag 
4626 ccaagaaaaggcacacctt 
4627 ccctaacagatttgaggat 
4628 aaacaaacacaggcattcc 
4629 gaaatactgttttcctatt 
4630 ataaactgcaagatttttc 
4631 gctttccaatgaccaagaa 
4632 ctgtgctttgtgagtttat 
4633 gaatttgaaagttcgtttt 
4634 aggaagtggccctgaatgc 
4635 tgttgaaagatttatcaaa 
4636 gacaagaaaaaggggattg 
4637 ctgagaacttcatcatttg 
4638 ctggacttctctagtcagg 
4639 tgaatctggctccctcaac 
4640 agaatccagatacaagaaa 
4641 cagaatccagatacaagaa 
4642 ggacagtgaaatattatga 
4643 ctggatgtaaccaccagca 
4644 atgaagcttgctccaggag 
4645 ctgcgctaccagaaagaca 
4646 tttgagttgcccaccatca 
4647 gttgaccacaagcttagct 
4648 caaagctggcaccagggct 
4649 gcttcaggaagcttctcaa 
4650 aggaaggccaagccagttt 
4651 atgattatgtcaacaagta 
4652 ctgacatcttaggcactga 



10026 


10045 1 


4 


10005 


10024 1 


4 


9245 


9264 1 


4 


9291 


9310 1 


4 


10072 


10991 1 


4 


13501 


13520 1 


4 


13788 


13807 1 


4 


13194 


13213 1 


4 


9231 


9250 1 


4 


4819 


4838 1 


4 


11172 


11191 1 


4 


12603 


12622 1 


4 


6090 


9109 1 


1 4 


11010 


11029 1 


1 4 


12817 


12836 1 


1 4 


12292 


12311 1 


1 4 


11634 


11653 1 


1 4 


13801 


13910 1 


1 4 


13890 


13909 1 


1 4 


12255 


12274 * 


1 4 


11667 


11686 ' 


1 4 


11653 


11872 • 


1 4 


9749 


9768 ' 


t 4 


11077 


11096 ' 


1 4 


7977 


7996 ' 


1 4 


9655 


9674 " 


1 4 


12836 


12855 


1 4 


13608 


13627 ' 


1 4 


11066 


11084 ' 


1 4 


9690 


9709 


1 4 


9280 


9299 


1 4 


10971 


10990 


1 4 


12933 


12952 


1 4 


10279 


10298 


1 4 


11438 


11457 


1 4 


8810 


8829 


1 4 


9046 


9065 


1 4 


6893 


6912 


1 4 


6892 


6911 


1 4 


13305 


13324 


1 4 


11186 


11205 


1 4 


13772 


13791 


1 4 


12080 

11666 


12099 


1 4 


11685 


1 4 


10547 


10566 


1 4 


13971 


13990 


1 4 


13216 


13235 


1 4 


12691 


12610 


1 4 


12363 


12382 


1 4 


5001 


9020 


1 4 
SEQ ID NO: 


00 10 


atgcaagtcagcccagttc 


10926' 


t0945Qpo 


in NO' 


4653gaactcagaaggatggcat 


14002 


14021 




4 


QFn in MO 




tgaatgctaacactaagaa 


1 0983 ' 


11002QFO 


in NO- 


4654ttctcaattttgattttca 


8526 


8545 




4 






agaagatcagatggaaaaa 


11004' 


11023SEQ 


ID NO- 


4S55ttttcta8atggaacttct 


12173 


12192 


1 


4 


Qpo in Kin 


OOlw 


ggctattcattctccatcc 


11264' 


11 283 QPO 


in NO' 


4656ggatctaaatgcagtagcc 


11632 


11651 




4 


qcn in NO 


3317 
00 1 f 


aaagttttaactaataaat 


11286' 


I1307QPO 


in NO- 


46d7atttcttaaacattcctU 


9489 


9506 




4 


Qpn In NO 


00 10 


agttttggctgataaattc 


11290' 


11309QPO 


in NO' 


4653gaatctggctcectcaact 


0047 


9066 




4 


CiFO in MO 


331Q 


ctggg ctgaaactaaatga 


11316' 


11335RPO 


in NO 


4659tcattctgggtctttocag 


11036 


11054 




4 


QPO in MO- 


339n 


cagagaaatacaaatctat 



11413' 


11 432 QPO 


in NO' 

iLy INW. 


4660ata9catggacttcttctg 


8873 


8892 




4 


epo in wo 


3351 


gaagtaaaaltccctgaag 


11480' 


11499QFO 


in NO' 

ILf l^lv, 


4662cttctggcttgctaacctc 


12306 


12325 




4 


QPO in NO 


3399 


cttttttgagataaccgtg 


11545 


11564QPO 


in NO' 


4663cacggagUactgaaaaag 


13723 


13742 




4 


QPO in NO 


3393 


QctaaaaUgtcattcctt 


11735 


[1754sPO 


in Mo< 

IL/ INS/. 


4664aaggcetctccacctcagc 


12102 


12121 




4 


QPO in NO< 


339^ 


gtgtataatgccacttgga 


11795' 


)1814afo 


in NO' 


4665tccaa9atgagatcaacac 


13104 


13123 




4 


QPO in MO 


339K 


attocacatacagctcaac 


11859' 


lie78QPn 


in NO- 
IL/ INWi 


4666gttQagaagccccaagaat 


6254 


6273 




4 


QFO in NO 


339fl 
OO^u 


tgaagaagatggcaaattt 


11992' 


12011 QFO 


in NO' 

IL/ MV/. 


4667aaattctcttncttttca 


0220 


9239 




4 


QPO in NO 


3*^97 
O0& r 


atcaaaagcccagcgttca 


12050 


12069QPn 


in NO' 


4668tgaaagtcaagcatctgat 


12669 


12688 




4 


QPO in NO 


339fi 


gtgggcatggatatggatg 


12143 


12182SFQ 


in NO- 


4669catccttaacaccttccac 


8071 


8090 




4 


0 in NO 


3330 


aaatggaacttctactaca 


12179' 


12198sFa 


ID NO 


4670tgtaccataagccatattt 


10088 


10107 




4 


QPO in NO 


3330 


aaaaacteaccatattcaa 


12219' 


12238cFo 


in NO' 


4671ttgatgttagagtgcmt 


6993 


7012 




4 


RFO in NO 


3331 


ctgagaagaaatdgcaga 


12426' 


12447 SEQ 


ID NO' 


4672tc(gc8cagaaatattcag 


13447 


13466 




4 


QPO in NO 


3339 


acaatgctgagtgggttta 


12447' 


12466SFD 


in NO- 
IL/ INV/. 


4673taaatggagtctttattgt 


14086 


14105 




4 


ceo in MO" 




caatgctgagtgggtttat 


12448 


12487sFn 


ID NO' 

IL/ INV/< 


4674 ataaatggagtctttattg 


14085 


14104 




4 


f^FO in NO 


333A 


ttaggcaaattgatgatat 


12477- 


12496SFO 


in NO- 

lU INV/i 


4675 atattgtcagtgcctctaa 


13392 


13411 




4 


QPO in NO 


333*% 
0003 


ataaactaatagatgtaat 


12897' 


1291 6 SFn 


ID NO- 

Iv INWi 


4676 attactatgaaaaatttat 


13641 


13660 




4 


QPO in NO 


333ft 
0000 


ccaactaatagaagataac 


13039 


13058QFO 


in NO- 
IL/ INV/i 


4677 gttattttgctaaacttgg 


14052 


14071 




4 


SFO in NO 


3337 


ttaattatatccaagatga 


13095 


131145:Fn 


ID NO 

IL/ INV/. 


4678 tcatcctctaattttttaa 


13800 


13819 




4 


QPO in MO 


0000 


tttaaattgttg aaagaaa 


13161 ' 


131 70 QPO 


in MO- 
IL' Vt\J, 


4679tttcatttg aaagaataaa 


7032 


7051 




4 


QPO in NO 


3330 


aaattcaataaatttattc 


13190 


13209qfo 


in NO 

IL/ lNV/< 


4660 gaataccaatgctgaactt 


10168 


10187 




4 


QPO in MO 




ttaaaaaaaaQ ataglcao 


13326 


13346 QPO 

OCVil 


in NO 

IL/ InV/ 


4681 ctgagagaagtgtcttcaa 


12407 


12426 




4 


RPO in NO 


33A1 


acttocattctgaatatat 


13377 


13398SFO 


ID NO 

IW INV/ 


4682 atatctggaaccttgaagt 


10737 


10756 




4 


RFO in NO 


334,9 


cacagaaatattcaggaat 


13451 


13470sEO 


ID NO 

11/ INV/ 


4683 attccctgaagttgatgtg 


11488 


11507 




4 


opo in MO 




ccattgcgacgaagaaaat 


13560 


13579SFO 


ID NO 

IL/ INV/ 


4684atttttattcx:tgccatgg 


10103 


10122 




4 


QPO in MO- 




tataaactgcaagattttt 


13607 


13626qfo 


in NO 


4685 aaaaltcaaactgcctata 


13873 


13892 




4 


QPO in MO 




tctgattactatgaaaaat 


13637 


13656SFO 


in NO 

IL/ INV/ 


4686 atttgtaagaaaatacaga 


6436 


6455 




4 


epo in NO 
wcw IL/ n\j 




ggagltactgaaaaagctg 


13728 


13745qfo 


in NO 

11/ INV/ 


4687 cagcatgcctagtttctcc 


9952 


9971 




4 


QPO in NO< 


33A7 


taaaccUgctccagaaga 


13773 


13792QFO 


ID MO- 
IL/ INV/ 1 


4688 tutcctltcttcatcUca 


10213 


10232 




4 


QPO in NO< 


33Aft 
00*rO 


tgaactggacdgcaocaa 


13955' 


13974 QPO 


in MO- 
IL/ Vi\J, 


4689ttggtagagcaagggtfca 


7866 


7876 




4 


QPO in NO 


334Q 


ttgctaaacttgggggagg 


14058 ' 


14077fti=o 


in NO' 

11/ InV/. 


4690cctcctacagtggtggcaa 


4230 


4249 




4 


QPO in MO 




qattcgaatatcaaattea 


4412 


4431 QPO 


m MO- 
IL/ lNV/« 


4691 tgaaaacgacaaagcaatc 


9603 


9622 


3 


3 


QPO in wo 


33K1 
000 1 


atttgtttgtcaaagaagt 


4551 


4670 QPO 


in NO- 
IL/ iNV/i 


4692 acttttctaaacttgaaat 


9063 


9082 


3 


3 


SEQ ID NO: 3352 tctcggttgctgccgctga 33 52 SEQ ID NO: 4693 tcagcccagccatttgaga 9236 9255 2 3 
SEQ ID NO: 3353 gctgaggagcccgcccagc 47 66 SEQ ID NO: 4694 gctggatgtaaccaccagc 11185 11204 2 3 
SEQ ID NO: 3354 ctggtctgtccaaaagatg 227 246 SEQ ID NO: 4695 catcagaaccattgaccag 8134 8153 2 3 
SEQ ID NO: 3355 ctgagagttccagtggagt 291 310 SEQ ID NO: 4696 actcaatggtgaaattcag 7465 7484 2 3 
SEQ ID NO: 3356 cagtgcaccctgaaagagg 404 423 SEQ ID NO: 4697 cctcacttcctttggactg 8977 8996 2 3 
SEQ ID NO: 3357 ctctgaggagtttgctgca 472 491 SEQ ID NO: 4698 tgcaaacttgacttcagag 11399 11418 2 3 
SEQ ID NO: 3358 acatcaagaggggcatcat 582 601 SEQ ID NO: 4699 atgacgttcttgagcatgt 7050 7069 2 3 
SEQ ID NO: 3359 ctgatcagcagcagccagt 830 849 SEQ ID NO: 4700 actggacttctctagtcag 8809 8828 2 3 
SEQ ID NO: 3360 ggacgctaagaggaagcat 865 884 SEQ ID NO: 4701 atgcctacgttccatgtcc 11354 11373 2 3 
SEQ ID NO: 3361 agctgttttgaagactctc 1087 1106 SEQ ID NO: 4702 gagaagtgtcttcaaagct 12411 12430 2 3 
SEQ ID NO: 3362 tgaaaaaactaaccatctc 1113 1132 SEQ ID NO: 4703 gagatcaacacaatcttca 13112 13131 2 3 
SEQ ID NO: 3363 ctgagctgagaggcctcag 1176 1195 SEQ ID NO: 4704 ctgaattactgcacctcag 3035 3054 2 3 
SEQ ID NO: 3364 tgaaacgtgtgcatgccaa 1311 1330 SEQ ID NO: 4705 ttggtagagcaagggttca 7856 7875 2 3 
SEQ ID NO: 3365 ccttgtatgcgctgagcca 1440 1459 SEQ ID NO: 4706 tggcactgtttggagaagg 9138 9157 2 3 
SEQ ID NO: 3366 aggagctgctggacattgc 1500 1519 SEQ ID NO: 4707 gcaagtcagcccagttcct 10928 10947 2 3 
SEQ ID NO: 3367 atttgattctgcgggtcat 1575 1594 SEQ ID NO: 4708 atgaaaccaatgaQaaaat 7428 7447 2 3 
SEQ ID NO: 3368 tccagaactcaagtcttca 1627 1646 SEQ ID NO: 4709 tgaaatacaatgctctgga 5520 5539 2 3 
SEQ ID NO: 3369 ggttcttcttcagactttc 1744 1763 SEQ ID NO: 4710 gaaataccaagtcaaaacc 10455 10474 2 3 
SEQ ID NO: 3370 gttgatgaggagtccttca 1810 1829 SEQ ID NO: 4711 tgaaaaagctgcaatcaac 13734 13753 2 3 
SEQ ID NO: 3371 tccaagatctgaaaaagtt 1941 1960 SEQ ID NO: 4712 aactgcttctccaaatgga 3552 3571 2 3 
SEQ ID NO: 3372 agttagtgaaagaagttct 1956 1975 SEQ ID NO: 4713 agaattcataatcccaact 8275 8294 2 3 
SEQ ID NO: 3373 gaagggaatcttatatttg 2084 2103 SEQ ID NO: 4714 caaaacctactgtctcttc 10467 10486 2 3 
SEQ ID NO: 3374 ggaagctctttttgggaag 2221 2240 SEQ ID NO: 4715 cttcacataccagaattcc 8324 8343 2 3 
SEQ ID NO: 3375 tggaataatgctcagtgtt 2374 2393 SEQ ID NO: 4716 aacaaacacaggcattcca 9656 9675 2 3 
SEQ ID NO: 3376 gatttgaaatccaaagaag 2408 2427 SEQ ID NO: 4717 cttcatgtccctagaaatc 10037 10056 2 3 
SEQ ID NO: 3377 tccaaagaagtcccggaag 2417 2436 SEQ ID NO: 4718 cttcagcctgctttctgga 4951 4970 2 3 
SEQ ID NO: 3378 aggaagggctcaaagaatg 2570 2589 SEQ ID NO: 4719 cattagagctgccagtcct 10020 10039 2 3 
SEQ ID NO: 3379 agaatgacttttttcttca 2583 2602 SEQ ID NO: 4720 tgaagatgacgacttttct 12160 12179 2 3 
SEQ ID NO: 3380 tttgtgacaaatatgggca 2766 2784 SEQ ID NO: 4721 tgccagtttgaaaaacaaa 11815 11834 2 3 
SEQ ID NO: 3381 ctgaggctaccatgacatt 3252 3271 SEQ ID NO: 4722 aatgtcagctcttgttcag 10903 10922 2 3 
SEQ ID NO: 3382 giagataccaaaaaaatga 3668 3687 SEQ ID NO: 4723 tcatttgccctcaacctac 11450 11469 2 3 
SEQ ID NO: 3383 aaatgacttccaatttccc 3681 3700 SEQ ID NO: 4724 gggaactgttgaaagattt 12927 12946 2 3 
SEQ ID NO: 3384 atgacttccaatttccctg 3683 3702 SEQ ID NO: 4725 caggagaacttactatcat 13785 13804 2 3 
SEQ ID NO: 3385 atctgccatctcgagagtt 4104 4123 SEQ ID NO: 4726 aactcctccactgaaagat 9547 9566 2 3 
SEQ ID NO: 3386 atttgtttgtcaaagaagt 4551 4570 SEQ ID NO: 4727 acttccgtttaccagaaat 8247 8266 2 3 
SEQ ID NO: 3387 gcagagcttggcctctctg 5135 5154 SEQ ID NO: 4728 cagagctttctgccactgc 13518 13537 2 3 


SEQ ID NO: 3388 atatgctgaaatgaaattt 5353 5372 SEQ ID NO: 4729 aaattcaaactgcctatat 13874 13893 2 3 
SEQ ID NO: 3389 tcaaaacttgacaacattt 5420 5439 SEQ ID NO: 4730 aaatacttccacaaattga 8780 8799 2 3 
SEQ ID NO: 3390 cagtgacctgaaatacaat 5512 5531 SEQ ID NO: 4731 attgaacatccccaaactg 8794 8813 2 3 
SEQ ID NO: 3391 tacaaatggcaatgggaaa 5848 5867 SEQ ID NO: 4732 tttcaactgcctttgtgta 11229 11248 2 3 
SEQ ID NO: 3392 cttttgtaaagtatgataa 6285 6304 SEQ ID NO: 4733 ttattgctgaatccaaaag 13656 13675 2 3 
SEQ ID NO: 3393 ttgtaaagtatgataaaaa 6288 6307 SEQ ID NO: 4734 ttttcaagcaaatgcacaa 8539 8558 2 3 
SEQ ID NO: 3394 tccattaacctcccatttt 6320 6339 SEQ ID NO: 4735 aaaagaaaattttgctgga 10756 10775 2 3 
SEQ ID NO: 3395 gattatctgaattcattca 6488 6507 SEQ ID NO: 4736 tgaagtagaccaacaaatc 7162 7181 2 3 
SEQ ID NO: 3396 aattgggagagacaagttt 6506 6525 SEQ ID NO: 4737 aaactaaatgatctaaatt 11324 11343 2 3 
SEQ ID NO: 3397 atttgaaaatagctattgc 6696 6715 SEQ ID NO: 4738 gcaatttctgcacagaaat 13441 13460 2 3 
SEQ ID NO: 3398 tgagcatgtcaaacacttt 7060 7079 SEQ ID NO: 4739 aaagccattcagtctctca 12971 12990 2 3 
SEQ ID NO: 3399 ttgaagatgttaacaaatt 7356 7375 SEQ ID NO: 4740 aattccatatgaaagtcaa 12660 12679 2 3 
SEQ ID NO: 3400 acttgtcacctacatttct 7753 7772 SEQ ID NO: 4741 agaatattttgatccaagt 13276 13295 2 3 
SEQ ID NO: 3401 gttttccacaccagaattt 8050 8069 SEQ ID NO: 4742 aaatctggatttcttaaac 9481 9500 2 3 
SEQ ID NO: 3402 ataagtacaaccaaaattt 9405 9424 SEQ ID NO: 4743 aaataaatggagtctttat 14083 14102 2 3 
SEQ ID NO: 3403 cgggacctgcggggctgag 8 27 SEQ ID NO: 4744 ctcagttaactgtgtcccg 11571 11590 1 3 
SEQ ID NO: 3404 agtgcccttctcggttgct 25 44 SEQ ID NO: 4745 agcatctgattgactcact 12678 12697 1 3 
SEQ ID NO: 3405 gctgaggagcccgcccagc 47 66 SEQ ID NO: 4746 gctgattgaggtgtccagc 1225 1244 1 3 
SEQ ID NO: 3406 gaggagcccgcccagccag 50 69 SEQ ID NO: 4747 ctggatcacagagtccctc 3752 3771 1 3 
SEQ ID NO: 3407 gggccgcgaggccgaggcc 72 91 SEQ ID NO: 4748 ggccctgatccccgagccc 1363 1382 1 3 
SEQ ID NO: 3408 ccaggccgcagcccaggag 89 108 SEQ ID NO: 4749 ctcccggagccaaggctgg 2682 2701 1 3 
SEQ ID NO: 3409 ggagccgccccaccgcagc 104 123 SEQ ID NO: 4750 gctgttttgaagactctcc 1088 1107 1 3 
SEQ ID NO: 3410 gaagaggaaatgctggaaa 200 219 SEQ ID NO: 4751 tttcaagttcctgaccttc 8309 8328 1 3 
SEQ ID NO: 3411 caaaagatgcgacccgatt 237 256 SEQ ID NO: 4752 aatcttattggggattttg 7085 7104 1 3 
SEQ ID NO: 3412 attcaagcacctccggaag 253 272 SEQ ID NO: 4753 cttccacatttcaaggaat 10067 10086 1 3 
SEQ ID NO: 3413 gttccagtggagtccctgg 297 316 SEQ ID NO: 4754 ccagcaagtacctgagaac 8610 8629 1 3 
SEQ ID NO: 3414 gactgctgattcaagaagt 316 335 SEQ ID NO: 4755 acttgaagaaaagatagtc 13324 13343 1 3 
SEQ ID NO: 3415 gtgccaccaggatcaactg 333 352 SEQ ID NO: 4756 cagtgaagctgcagggcac 10704 10723 1 3 
SEQ ID NO: 3416 gatcaactgcaaggttgag 343 362 SEQ ID NO: 4757 ctcacctccacctctgatc 4748 4767 1 3 
SEQ ID NO: 3417 actgcaaggttgagctgga 348 367 SEQ ID NO: 4758 tccactcacatcctccagt 1289 1308 1 3 
SEQ ID NO: 3418 ccagctctgcagcttcatc 373 392 SEQ ID NO: 4759 gatgtggtcacctacctgg 1343 1362 1 3 
SEQ ID NO: 3419 agcttcatcctgaagacca 383 402 SEQ ID NO: 4760 tggtgctggagaatgagct 5112 5131 1 3 
SEQ ID NO: 3420 cttcatcctgaagaccagc 385 404 SEQ ID NO: 4761 gctggagtaaaactggaag 2696 2715 1 3 
SEQ ID NO: 3421 ccagccagtgcaccctgaa 399 418 SEQ ID NO: 4762 ttcaagatgactgcactgg 1539 1558 1 3 
SEQ ID NO: 3422 cagtgcaccctgaaagagg 404 423 SEQ ID NO: 4763 cctcacagagctatcactg 5230 5249 1 3 
SEQ ID NO: 3423 tggcttcaaccctgagggc 427 446 SEQ ID NO: 4764 gcccactggtcgcctgcca 3533 3552 1 3 
SEQ ID NO: 3424 cttcaaccctgagggcaaa 430 449 SEQ ID NO: 4765 tttgagccaacattggaag 2207 2226 1 3 
SEQ ID NO: 3425 ttcaaccctgagggcaaag 431 450 SEQ ID NO: 4766 ctttgacaggcattttgaa 9727 9746 1 3 
SEQ ID NO: 3426 cttgctgaagaaaaccaag 451 470 SEQ ID NO: 4767 cttgaaattcaatcacaag 9074 9093 1 3 
SEQ ID NO: 3427 tgctgaagaaaaccaagaa 453 472 SEQ ID NO: 4768 ttctgctgccttatcagca 5647 5666 1 3 
SEQ ID NO: 3428 ttgctgcagccatgtccag 483 502 SEQ ID NO: 4769 ctggtcagtttgcaagcaa 3004 3023 1 3 
SEQ ID NO: 3429 tgctgcagccatgtccagg 484 503 SEQ ID NO: 4770 cctggtcagtttgcaagca 3003 3022 1 3 
SEQ ID NO: 3430 agccatgtccaggtatgag 490 509 SEQ ID NO: 4771 ctcacatcctccagtggct 1293 1312 1 3 
SEQ ID NO: 3431 agctcaagctggccattcc 507 526 SEQ ID NO: 4772 ggaactaccacaaaaagct 7489 7508 1 3 
SEQ ID NO: 3432 agaagggaagcaggttttc 526 545 SEQ ID NO: 4773 gaaatcttcaatttattct 13821 13840 1 3 
SEQ ID NO: 3433 aagggaagcaggttttcct 528 547 SEQ ID NO: 4774 aggacaccaaaataacctt 7572 7591 1 3 
SEQ ID NO: 3434 agaaagatgaacctactta 555 574 SEQ ID NO: 4775 taagaactttgccacttct 4852 4871 1 3 
SEQ ID NO: 3435 atcctgaacatcaagaggg 575 594 SEQ ID NO: 4776 ccctaacagatttgaggat 7977 7996 1 3 
SEQ ID NO: 3436 tcctgaacatcaagagggg 576 595 SEQ ID NO: 4777 cccctaacagatttgagga 7976 7995 1 3 
SEQ ID NO: 3437 ctgaacatcaagaggggca 578 597 SEQ ID NO: 4778 tgcctgcctttgaagtcag 7908 7927 1 3 


SEQ ID NO: 


3438 


aacatcaagaggggcatca 


681. 


600 SEQ ID NO 


477gtgataaaaaccaagatgtt 


6298 


8317 1 


1 3 


SEQ ID NO: 


3439 


acatcaagaggggcatcat 


582 


601 SEQ ID NO, 


4780atgataaaaaccaagatgt 


6297 


6316 1 


1 3 


SEQ ID NO: 


3440 


tcatttctgccctcctggt 


597 


616SEQ ID NO. 


4781 accaccagtttgtagatga 


7413 


7432 ' 


1 3 


SEQ ID NO: 


3441 


ttcccccagagacagaaga 


615 


634sEQ ID NO; 


4782 tcttccacatttcaaggaa 


10066 


1 0085 1 


1 3 


SEQ ID NO: 


3442 


gaagaagccaagcaagtgt 


629 


648SEQ ID NO: 


4783acaccttccacattccttc 


8079 


8098 1 


1 3 


SEQ ID NO: 


3443 


ttgtttctggataccgtgt 


647 


866SEQ ID NO; 


4784acactaaatacttccacaa 


8775 


8794 1 


1 3 


SEQ ID NO: 


3444 


tgtatggaaactgctccac 


663 


682sEQ ID NO; 


4785gtggaggcaacacattaca 


2928 


2947 1 


1 3 


SEQ ID NO: 


3445 


aaactgctccactcacttt 


670 


589SEQ ID NO: 


4786aaagaaacagcatttgttt 


4540 


4559 1 


3 


SEQ ID NO: 


3446 


actcactttaccgtcaaga 


680 


699SEQ ID NO: 


4787tcttacttttccattgagt 


10580 


10599 1 


3 


SEQ ID NO: 


3447 


ctttaccgtcaagacgagg 


685 


704SEQ ID NO: 


4788cctccagctc»tgggaaag 


2491 


2610 1 


3 


SEQ ID NO: 


3448 


ttaocgtcaagacgaggaa 


687 


706SEQ ID NO: 


4789ttcctaaagctggat9taa 


11177 


11196 1 


3 


SEQ ID NO; 


3449 


acgaggaagggcaatgtgg 


698 


717SEQ ID NO: 


4790ccacaagtcatcatctcgt 


6964 


5983 1 


3 


SEQ ID NO: 


3450 


cgaggaagggcaatgtggc 


699 


718SEQ1DNO: 


4791 gccagaagtgagatcctcg 


3515 


3534 1 


3 


SEQ ID NO: 


3451 


gaggaagggcaatgtggca 


700 


719SEQ ID NO: 


4792tgccagtctccatgacctc 


2476 


2495 1 


3 


SEQ ID NO: 


3452 


ggaagggcaatgtggcaac 


702 


721 SEQ ID NO: 


4793gttgctcttaaggactta; 


13364 


13383 1 


3 


SEQ ID NO: 


3453 


gaagggcaatgtggcaaca 


703 


722 SEQ ID NO* 


4794tgttgatgaggagtccltc 


1809 


1828 1 


3 


SEQ ID NO: 


3454 


caggcatcagcccacttgc 


777 


798SEQ ID NO: 


4795gcaagtctttcctggcctg 


3019 


3038 1 


3 


SEQ ID NO: 


3455 


aggcatcagcccacttgct 


778 


797SEQ ID NO: 


4796agcaagtctltcctggcct 


3018 


3037 1 


3 


SEQ ID NO: 


3456 


tcagcccacttgctctcat 


783 


802SEQ ID NO: 


4797atgaaagtcaagcatctga 


12668 


12687 1 


3 


SEQ ID NO: 


3457 


gtcaactctgatcagcagc 


623 


M2sEQ ID NO: 


4798gctgac^ttaaaatctgac 


4819 


4838 1 


3 


SEQ ID NO: 


3458 


ggacgctaagaggaagcat 


865 


884SEQ ID NO: 


4799atgcactgtttctgagtcc 


9339 


9358 1 


3 


SEQ ID NO: 


3459 


aaggagcaacacctcttcc 


902 


921 SEQ ID NO: 


4800ggaatatcttagcatcctt 


13465 


13484 1 


3 


SEQ ID NO: 


3460 


aggagcaacacctcttcct 


903 


922 SEQ ID NO:


4801 aggaatatcttagcatcct 


13464 


13483 1 


3 


SEQ ID NO: 


3461 


caacacctcttcctgcctt


908 


927SEQ ID NO: 


4802aaggctgactctgtggttg 


4292 


4311 1 


3 


SEQ ID NO: 


3462 


aacacctcttcctgccttt


909 


928SEQ ID NO: 


4803aaagcaggccgaagctgtt 


1076 


1095 1


3 
SEQ ID NO:


3463


acaaaaataagtatcicigat 


933 


952 SEQ ID NO:


4804atccatgatctacatttgt 


6794 


6813 1 


3 


SEQ ID NO:


3464


caaoaataacitataaciatQ 


934 


953SEQ ID NO: 


4805catGactttacaagccttg 


1246 


1265 1


3 


SEQ ID NO:


3465


tacicacaaataacacaaac 


954 


973SEQ ID NO: 


4806gtctcttcgttctatgcta


4592 


4611 1 


3 


SEQ ID NO:


3466


aQcacaaata acacaa act 


955 


974SEQ ID NO: 


4807agtctcttcgttctatgct


4591 


4610 1 


3 


SEQ ID NO:


3467

acacaaotaacacaciactt 


956 


975SEQ ID NO: 


4808 aagtgtagtctcctggtgc 


5099 


5118 1 


3 


SEQ ID NO:


3468

aactteiaaciacacaccaaa 


978 


997 SEQ ID NO: 


4809tttgaggattccatcagtt


7987 


8006 1 


3 


SEQ ID NO:


3469


acttctttQafqaaagtac 


1008 


1027SEQ ID NO: 


4810gtacctacttttggcaagc 


8372 


8391 1 


3 


SEQ ID NO:


3470

ctttacrtcaa oatactaaa 


1012 


1031 SEQ ID NO: 


4811 cttatgggatttcctaaag


11167 


11186 1 


3 


SEQ ID NO:


3471

tactaaa aaaataaacctc 


1024 


1043 SEQ ID NO: 


4812gagggtagtcataacagta 


10337 


10356 1 


3 


SEQ ID NO:


3472

tttaaaaacaccaaatcca 


1046 


1065 SEQ ID NO: 


4813tggaagtgtcagtggcaaa


10380 


10399 1 


3 


SEQ ID NO:


3473

agagcaocaaatccacatc 


1060 1079 SEQ ID NO:


4814gatggatatgaccttctct


4876 


4895 1 


3 


SEQ ID NO:


3474

aQctQttttciaaaactctc 


1087 1106 SEQ ID NO:


4815gagaacatactgggcagct


5880 


5899 1 


3 


SEQ ID NO:


3475


tQ a aa aaactaa ccatctc 


1113 


1132SEQ ID NO: 


4816gagaaaatcaatgccttca


7112 


7131 1 


3 


SEQ ID NO:


3476


aaaaaaactaaccatctct 


1114 


1133SEQ ID NO: 


4817 agagccaggtcgagctttc


11052 


11071 1 


3 


SEQ ID NO:


3477


tctaaQcaaaatatccacia 


1130 


1149SEQ ID NO: 


4818tctgatgaggaaactcaga


12260 


12279 1 


3 


SEQ ID NO:


3478

tctcttcaataaactaatt 


1166 


1175SEQ ID NO: 


4819aacctcccattttttgaga


6326 


6345 1 


1 3 


SEQ ID NO:


3479


ctaaactaaaaaacctcaQ 


1176 


1195SEQ ID NO: 


4820ctgatccccgagccctcag 


1367 


1386 1 


1 3 


SEQ ID NO:


3480


ta aa QcaDtcacatctctc 


1198 


1217SEQ ID NO: 


4821 gagaaaatcaatgccttca 


7112 


7131 1 


1 3 


SEQ ID NO:


3481


aa Q caatcacatctctctt 


1200 


1219SEQ ID NO: 


4822 aagaggcagcttctggctt 


12297 


12316 ' 


1 3 


SEQ ID NO:


3482


ctctcttaccacaactgat 


1212 


1231 SEQ ID NO: 


4823atcaaaagaagcccaagag 


12946 


12965 - 


1 3 


SEQ ID NO:


3483


tcttaccacacictaattaa 


1215 


1234SEQ ID NO 


4824tcaaagttaattgggaaga ' 


12279 


12298 - 


1 3 


SEQ ID NO:


3484


cttoccacagctq attaaa 


1216 


1235SEQ ID NO 


4825ctcaattttgattttcaag 


8528 


8547 '


1 3 


SEQ ID NO:


3485


taaa Qtqtccaciccocatc 


1231 


1250SEQIDNO 


4826gatggaaccctctccctca 



4733 


4752 1


1 3 


SEQ ID NO:


3486


tcaatataoacaacctcag 


1287 


1288SEQ ID NO 


4827ctgacatcttaggcactga 


5001 


5020 ' 


1 3 


SEQ ID NO:


3487


acatcctccaataactgaa 


1296 


1315SEQ ID NO 


4828ttcagaagctaagcaatgt 


7239 


7258 ' 


1 3 


SEQ ID NO:


3488


acacaacagctciociagaQa 


1385 


1404SEQ ID NO 


4829tctctgaaagacaacgtgc 


12323 


12342 ' 


1 3 


SEQ ID NO:


3489


caacaactacaaQ aciatct 


1388 


1407SEQ ID NO 


4830agataacattaaacagctg 


13051 


13070 ' 


1 3 


SEQ ID NO:


3490


acaaaqqatcaQCQcaQCC 


1415 


1434SEQ ID NO 


4831 ggctcaacacagacatcgc 


5718 


5737 


1 3 


SEQ ID NO:


3491


aagacaaaccctacaggga 


1478 


1497SEQ ID NO 


4832tcccagaaaacctcttctt 


3936 


3955 ' 


1 3


SEQ ID NO:


3492


caaaaactactaaacatta 


1499 1518SEQIDNO 


4833caatggagagtccaacctg 


4660 


4679 ' 


1 3 


SEQ ID NO:


3493


aagagctgctggacattgc


1500 


1519SEQ ID NO 


4834gcaagggttcactgttcct 


7864 


7883 


1 3 


SEQ ID NO:


3494


ctgctggacattgctaatt


1505 


1524SEQ ID NO 


4835aattgggaagaagaggcag 


12287 


12306 ' 


1 3 


SEQ ID NO:


3495


gattacacctatttaattc


1565 


1584SEQ ID NO 


4836gaatattttgagaggaatc 


6353 


6372 


1 3 


SEQ ID NO:


3496


atttaattctgcgggtcat


1575 


1594SEQ ID NO 


4837atgaagtagaccaacaaat 


7161 


7180 


1 3 


SEQ ID NO:


3497


tctacgggtcattggaaat


1582 


1601 SEQ ID NO 


4838atttgtaagaaaatacaga 


6436 


6455 


1 3 


SEQ ID NO:


3498


aaccatggagcagttaact


1609 1628SEQIDNO 


4839agtttctccatcctaggtt


9962 


9981 


1 3 


SEQ ID NO:


3499


ggagcagttaactccagaa


1615 


1634SEQ ID NO 


4840ttctgaaaatGcaatctcc 



8400 


8419 


1 3 


SEQ ID NO:


3500


actccagaactcaagtctt 


1625 


1644 SEQ ID NO:


4841aagatcgcagaGtttgagt


11654 


11673 


1 3 


SEQ ID NO:


3501


tccagaactcaagtcttca


1627 


1646SEQ ID NO 


4842tgaactcagaagaattgga 


1920 


1939 


1 3 


SEQ ID NO:


3502


aagtacaaagccatcactg 


1663 


1682SEQ ID NO 


4843cagtcatgtagaaaaactt 


4429 


4448 


1 3 


SEQ ID NO:


3503 


gccatcactgatgatccag


1672 


1691 SEQ ID NO 


4844ctggaactctctccatggc 


10883 


10902 


1 3 


SEQ ID NO: 


3504 


ccatcactgatgatccaga 


1673 


1692 SEQ ID NO:


4845tctgaactcagaaggatgg


13999 


14018 


1 3 


SEQ ID NO: 


3505 


atccagaaagctgccatcc 


1685 


1704 SEQ ID NO:


4846ggatttcctaaagctggat


11173 


11192 


1 3 


SEQ ID NO: 


3506 


cagaaagctgccatccagg


1688 


1707 SEQ ID NO 


4847cctgaaatacaatgctctg


5518 


5537 


1 3 


SEQ ID NO: 


3507 


acaaggaccaggaggttct 


1731 


1750SEQ ID NO 


4848agaaacagcatttgtttgt 


4542 


4561 


1 3 


SEQ ID NO: 


3508 


aggaccaggaggttcttct 


1734 


1753SEQ ID NO 


4849 agaagctaagcaatgtcct 


7242 


7261 


1 3 


SEQ ID NO: 


3509 


accaggaggttcttcttca 


1737 


1756 SEQ ID NO:


4850tgaaggctgactctgtggt


4290 


4309 


1 3 


SEQ ID NO: 


3510 


tcttcagactttccttgat 


1750 


1769 SEQ ID NO:


4851 atcaggaagggctcaaaga


2567 


2586 


1 3 


SEQ ID NO: 


3511 


ttcagactttccttgatga 


1752 


1771 SEQ ID NO 


4852tcattactcctgggctgaa 


11307 


11326 


1 3 


SEQ ID NO: 


3512 


gttgatgaggagtccttca 


1810 


1829 SEQ ID NO:


4853tgaatctggctccctcaac


9046 


9065 


1 3 
SEQ ID NO:


3513 


cttcacaggcagatattaa 


1824 


1843 SEQ ID NO:


4854ttaatcgagaggtatgaag


7148 


7167 


1 3 


SEQ ID NO: 


3514 


ttcacaggcagatattaac 


1825 


1844 SEQ ID NO:


4855gttaatcgagaggtatgaa


7147 


7166 


1 3 


SEQ ID NO: 


3515 


ggcagatattaacaaaatt 


1831 


1850 SEQ ID NO:


4856aattgcattagatgatgcc


6589 


6608 


1 3 


SEQ ID NO: 


3516 


atattaacaaaattgtcca 


1836 


1855 SEQ ID NO:


4857tggagtttgtgacaaatat


2760 


2779


1 3 


SEQ ID NO: 


3517 


acaaaattgtccaaattct 


1842 


1861 SEQ ID NO:


4858agaaacagcatttgtttgt


4542 


4561 


1 3 


SEQ ID NO: 


3518 


gagcaagtgaagaactttg 


1877 


1896 SEQ ID NO:


4859caaatgacatgatgggctc


5334 


5353 


1 3 


SEQ ID NO: 


3519 


gtgaagaactttgtggctt 


1883 


1902 SEQ ID NO:


4860aagcatctgattgactcac


12677 


12696 


1 3 


SEQ ID NO: 


3520 


agaactttgtggcttccca 


1887 


1906 SEQ ID NO:


4861 tgggcctgccccagattct


8909 


8928 


1 3 


SEQ ID NO: 


3521 


tttgtggcttcccatattg 


1892 


1911 SEQ ID NO:


4862caataagatcaatagcaaa


8998 


9017 


1 3 


SEQ ID NO: 


3522 


tggcttcccatattgccaa 


1896 


1915 SEQ ID NO:


4863ttggctcacatgaaggcca 


7631 


7650 


1 3 


SEQ ID NO: 


3523 


ttcccatattgccaatatc 


1900 


1919 SEQ ID NO:


4864gatatacactagggaggaa


12745 


12764 


1 3 


SEQ ID NO: 


3524 


tcccatattgccaatatct


1901 


1920SEQIDNO 


4865 agatcaaagttaattggga 


12276 


12295 


1 3 


SEQ ID NO: 


3525 


ttgccaatatcttgaactc


1908 


1927SEQ ID NO 


4866 gagtcccagtgcccagcaa 


9352 


9371 


1 3 


SEQ ID NO: 


3526 


ttggatatccaagatctga 


1934 


1953SEQ ID NO 


4867tcagtataagtacaaccaa 


9400 


9419 


1 3 


SEQ ID NO: 


3527 


tccaagatctgaaaaagtt 


1941 


1960SEQ ID NO 


4868aacttccaactgtcatgga


1986 


2005 


1 3 


SEQ ID NO: 


3528 


ctgaaaaagttagtgaaag 


1949 


1968 SEQ ID NO:


4869ctttgaagtcagtcttcag


7915 


7934 


1 3 


SEQ ID NO: 


3529 


agttagtgaaagaagttct 


1956 


1975 SEQ ID NO:


4870agaatctcaacttccaact 


1978 


1997 


1 3 


SEQ ID NO: 


3530 


aatctcaacttccaactgt 


1980 


1999SEQ ID NO 


4871 acaggggtccttta^att 


12350 


12369 


1 3 


SEQ ID NO: 


3531 


gtcatggacttcagaaaat 


1997 


2016SEQ ID NO 


4872atttgaaagaataaatgac 


7036 


7055 


1 3 


SEQ ID NO: 


3532 


tcaactctacaaatctgtt 


2029 


2048SEQ ID NO 


4873aacacattgaggctattga 


6978 


6997 


1 3 


SEQ ID NO: 


3533 


aactctacaaatctgtttc 


2031 


2050 SEQ ID NO 


4874gaaaaaggggattgaagtt


10284 


10303 


1 3 


SEQ ID NO: 


3534 


aaatagaagggaatcttat


2079 


2098SEQ ID NO 


4875ataagcaaactgttaattt 


6457 


5476 


1 3 


SEQ ID NO: 


3535 


agaagggaatcttatattt 


2083 


2102SEQ ID NO 


4876aaatgcactgctgcgttct 


4900 


4919 - 


1 3 


SEQ ID NO: 


3536 


gaagggaatcttatatttg 


2084 


2103SEQ ID NO 


4877caaaaacattttcaacttc 


5287 


5306 ' 


1 3 


SEQ ID NO: 


3537 


tgatccaaataactacctt 


2101 


2120SEQ ID NO 


4878aaggaagaaagaaaaatca 


3461 


3480 ' 


1 3 


SEQ ID NO: 


3538 


tggatttgcttcagctgac 


2158 


2177SEQ ID NO 


4879gtcagcccagttccttcca 


10932 


10951 ' 


1 3 


SEQ ID NO: 


3539 


tttgcttcagctgacctca


2162 


2181 SEQ ID NO:


4880tgaggaaactcagatcaaa 


12265 


12284 1 


1 3 


SEQ ID NO: 


3540 


cttggaaggaaaaggcttt


2191 


2210SEQ ID NO: 


4881 aaagcattggtagagcaag 


7850 


7869 1 


1 3 


SEQ ID NO: 


3541 


tggaaggaaaaggctttga 


2193 


2212SEQ ID NO: 


4882tcaagtctgtgggattcca 


4086 


4105 - 


1 3


SEQ ID NO: 


3542 



ggctttgagccaacattgg 


2204 


2223SEQ ID NO: 


4883ccaagaggtatttaaagcc 


12958 


12977 1 


1 3 


SEQ ID NO: 


3543 


tgagccaacattggaagct 


2209 


2228SEQ ID NO: 


4884agctttctgccactgctca


13521 


13540 1 


1 3 


SEQ ID NO: 


3544 


gagccaacattggaagctc 


2210 


2229SEQ ID NO: 


4885gagctttctgccactgctc


13520 


13539 1 


3 


SEQ ID NO: 


3545 


aacattggaagctcttttt 


2215 


2234SEQ ID NO: 


4886 aaaagaaacagcatttgtt 


4539 


4558 1 


3 


SEQ ID NO: 


3546 


tggaagctctttttgggaa 


2220 


2239SEQ ID NO: 


4887ttccggcacgtgggttcca 


3765 


3804 1 


3 


SEQ ID NO: 


3547 


ctctttttgggaagcaagg 


2226 


2245SEQ ID NO: 


4888 ccttactgactttgcagag 


7798 


7817 1 


3 


SEQ ID NO: 


3548 


tttttgggaagcaaggatt 


2229 


2248SEQ ID NO: 


4889aatcattgaaaaattaaaa 


6730 


6749 1


3 


SEQ ID NO: 


3549 



ttttcccagacagtgtcaa 


2247 


2266SEQ ID NO: 


4890ttgatgaaatcattgaaaa 


6723 


6742 1 


3 


SEQ ID NO: 


3550 


ttggctataccaaagatga


2331 


2350SEQ ID NO: 


4891 tcattgctcccggagccaa 


2676 


2695 1 


3 


SEQ ID NO: 


3551 


ataccaaagatgataaaca 


2337 


2356SEQ ID NO: 


4892tgttgcttttgtaaagtat 


6280 


6299 1 


3 


SEQ ID NO: 


3552


gagcaggatatggtaaatg 


2357


2376SEQ ID NO: 


4893 catttcagccttcgggctc


4262 


4281 1 


3 


SEQ ID NO: 


3553 



atggtaaatggaataatgc 


2366 


2385SEQ ID NO: 


4894 gcatgcctagtttctccat 


9954 


9973 1 


3 


SEQ ID NO: 


3554 


tggtaaatggaataatgct 


2367 


2386SEQ ID NO: 


4895agcacagtacgaaaaacca


10609 


10628 1


3 


SEQ ID NO: 


3555 


taaatggaataatgctcag 


2370 


2389SEQ ID NO: 


4896ctgaaagagatgaaattta 


13067 


13086 1 


3 


SEQ ID NO: 


3556 


tggaataatgctcagtgtt 


2374 


2393SEQ ID NO: 


4897aacagatttgaggattcca


7981 


8000 1 


3 


SEQ ID NO: 


3557 


tcagtgttgagaagctgat 


2385 


2404SEQ ID NO: 


4898 atcacaactcctccactga 


9542 


9561 1 


3 


SEQ ID NO: 


3558 


cagtgttgagaagctgatt 


2386 


2405SEQ ID NO: 


4899aatcacaactcctccactg 


9541 


9560 1


3 


SEQ ID NO: 


3559 


agtgttgagaagctgatta 


2387 


2406SEQ ID NO: 


4900taatcacaactcctccact 


9540 


9559 1 


3 


SEQ ID NO: 


3560


gattaaagatttgaaatcc 


2401 


2420 SEQ ID NO: 


4901 ggatactaagtaccaaatc 


6874 


6893 1 


3 


SEQ ID NO: 


3561


gatttgaaatccaaagaag 


2408 


2427SEQ ID NO: 


4902 cttccgtttaccagaaatc 


8248 


8267 1 


3 


SEQ ID NO: 


3562 


atttgaaatccaaagaagt 


2409 


2428SEQ ID NO: 


4903 acttccgtttaccagaaat


8247 


8266 1 


3 
SEQ ID NO
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 



3563 

3564 

3565 

3566 

3567 

3568 

3569 

3570 

3571 

3572 

3573 

3574 

3575

3576 

3577 

3578 

3579

3580

3581 

3582 

3583 

3584 

3585

3586 

3587 

3588 

3589 

3590 

3591 

3592 

3593 

3594 

3595 

3596 

3597 

3598 

3599 

3600 

3601 

3602 

3603 

3604 

3605 

3606 

3607 

3608 

3609 

3610 

3611 

3612 



atccaaagaagtcccggaa 2416

tccaaagaagtcccggaag 2417

agagcctacctccgcatct 2438

gagcctacctccgcatctt 2439

cttgggagaggagcttggt 2455 

ggagcttggttttgccagt 2464

ttggttttgccagtctcca 2469 

cagtctccatgacctccag 2479 

ctccatgacctccagctcc 2483 

ctgggaaagctgcttctga 2501

gaggtcatcaggaagggct 2561 

aagaatgacttttttcttc 2582 

cttttttcttcactacatc 2590 

catcttcatggagaatgcc 2605

cttcatggagaatgccttt 2608 

aatgcctttgaactcccca 2618 

gcctttgaactccccactg 2621 

caaggctggagtaaaactg 2692 

tggagtaaaactggaagta 2698 

ggaagtagccaacatgcag 2710

tttgtgacaaatatgggca 2765 

tgtgacaaatatgggcatc 2767 

ggacttcgctaggagtggg 2794 

gtggggtccagatgaacac 2808 

ttccacgagtcgggtctgg 2834 

agtcgggtctggaggctca 2841

tcgggtctggaggctcatg 2843 

aaaagctgggaagctgaag 2869 

aagctgaagtttatcattc 2879 

gagaccagtcaagctgctc 2908

gcaacacattacatttggt 2934

acattacatttggtctcta 2939

cattacatttggtctctac 2940 

aaacggaggtgatcccacc 2964 

attgagaacaggcagtcct 2987 

tgagaacaggcagtcctgg 2989 

ctgcacctcaggcgcttac 3043

tccacagactccgcctcct 3074 

ctgaccggggacaccagat 3101 

tagagctggaactgaggcc 3120

ctatgagctccagagagag 3175 

cttggtggataccctgaag 3202

ttgtaactcaagcagaagg 3222 

taactcaagcagaaggtgc 3225 

gcagaaggtgcgaagcaga 3233 

cagaaggtgcgaagcagac 3234 

gtatgaccttgtccagtga 3288 

tatgaccttgtccagtgaa 3289

gaagtccaaattccggatt 3305 

gagggcaaaacgtcttaca 3371 



2435 SEQ ID NO 
2436SEQ ID NO 
2457 SEQ ID NO 
2458 SEQ ID NO 
2474SEQ ID NO 
2483SEQ ID NO 
2488SEQ ID NO 
2498SEQ ID NO
2502 SEQ ID NO 
2520 SEQ ID NO 
2580SEQ ID NO 
2601 SEQ ID NO 
2609SEQ ID NO 
2624SEQ ID NO 
2627SEQ ID NO 
2637SEQ ID NO 
2640SEQ ID NO 
2711 SEQ ID NO 
2717SEQ ID NO 
2729SEQ ID NO 
2784SEQ ID NO 
2786SEQ ID NO 
2813SEQ ID NO 
2827SEQ ID NO 
2853SEQ ID NO 
2860SEQ ID NO
2862SEQ ID NO 
2888SEQ ID NO 
2898SEQ ID NO
2927SEQ ID NO 
2953SEQ ID NO
2958SEQ ID NO 
2959SEQ ID NO 
2983SEQ ID NO 
3006SEQ ID NO 
3008SEQ ID NO 
3062 SEQ ID NO 
3093 SEQ ID NO 
3120SEQ ID NO 
3139SEQ ID NO 
3194SEQ ID NO
3221 SEQ ID NO 
3241 SEQ ID NO 
3244SEQ ID NO 
3252SEQ ID NO 
3253 SEQ ID NO 
3307SEQ ID NO 
3308SEQ ID NO 
3324SEQ ID NO 
3390SEQ ID NO 



4904 ttccaatttccctgtggat 

4905cttccaatttccctgtgga

4906agattaatccgctggctct 

4907aagattaatccgctggctc

4908 accactgggacctaccaag 

4909actggtggcaaaaccctcc

4910tggacaagccacactccaa

4911 ctggtcgcctgccaaactg

4912ggagtcattgctcccggag 

4913tcagaaagctaccttccag

4914agccagaagtgagatcctc

4915gaaggcatctgggagtctt

4916 gatgcttacaacactaaag

4917ggcacttccaaaattgatg

4918 aaagttaattgggaagaag

4919tgggctggcttcagccatt

4920cagtctgaacattgcaggc 

4921 cagtgcaacgaccaacttg

4922tactccaacgccagctcca 

4923ctgccatctcgagagttcc

4924tgcctttgtgtacaccaaa

4925gatgggtctctacgccaca 

4926 cccaaggccacaggggtcc 

4927gtgttctagacctctccac 

4928ccagaatctgtaccaggaa 

4929tgagaactacgagctgact

4930catgaaggccaaattccga 

4931 cttccagacacctgatttt 

4932gaatttacaattgttgctt 

4933gagcttcaggaagcttctc 

4934 accagtcagatattgttgc 

4935tagaatatgaactaaatgt 

4936gtagctgagaaaatcaatg 

4937ggtggataccctgaagttt 

4938aggaaaagcgcacctcaat 

4939ccagcttccccacatctca 

4940gtaagaaaatacagagcag 

4941 aggacagagccttggtgga 

4942atctgatgaggaaactcag 

4943 ggcctctctggggcatcta 

4944ctctcacaaaaaagtatag

4945cttcaggaagcttctcaag 

4946 ccttacacaataatcacaa 

4947gcacctagctggaaagtta 

4948tctgtgggattccatctgc 

4949 gtctgtgggattccatctg 

4950tcaccaacggagaacatac 

4951 ttcaccaacggagaacata 

4952aatctcaagctttctcttc 

4953tgtaGaactggtccgcctc 



3686 


3705


A 4 

1 3 


3687 


3706 


A A 

1 3 


8571


8590 


A <9 

1 3 


8570


8589 


A 4 

1 3 




12546 


A 4 

1 3 




2753 


4 O 
1 O 




1U79U 


1 i5 


QCQQ 

oooo 


0007 


A O 

1 3 


cXil £. 




•1 o 

1 o 


7QQQ 


/yoo 


A O 


OO I** 


oOoo 


A t 






4 9 
I O 


OlUf 




A Q 


lU/ lo 




A O 
1 0 




IZoOO 


A A 

1 3 


QfOJ 


0756 


A A 

1 3 


o3o3 


IT Ann 

5402 


A A 

1 3 


Kf\or\ 

OOoO 


5099 


A A 

1 3 


3059 


4 ATA 

3078 


4 A 

1 3 


4106 


4125 


1 3 


11236 


11255 


1 3 


4oo5 


4404 


1 3 


12341 


<l A A A A 

12360 


1 3 


4179 


A A 

4198 


1 3 


12562 


A 0%m0*A 

12581 


1 3 


4807 


A AAA 4 

4826 


1 3 


7do9 


7658 


1 A 

1 3 


7951 


•TA"fA * 

7970 


1 3 




d2oo 


1 3 






1 3 




1 021 0 1 


A 

3 


11oo9 


A A nno •) 

11908 1 


3 




"74 OC A 


A 

3 






A 

3 




IZOOO 1 


A 

3 




oooO 1 


3 


6440 


6459 1 


3 


o192 


OA A A A 

3211 1 


3 


12259 


A AA*VB A 

12278 1 


3 


5144 


CA 04 A 

5163 1 


3 


6549 


6568 1 


3 


13217 


13236 1 


3 


9530 


9549 1 


3 


6955 


6974 1 


3 


4091 


4110 1 


3 


4090 


4109 1 


3 


10851 


10870 1 


3 


10850 


10869 1 


3 


10052 


10071 1 


3 


4215 


4234 1 


3 
SEQ ID NO:


3613


agggcaaaacgtcttacag 


3372 3391 SEQ ID NO:


4954ctgttaggacaccagccct


4062 


4081 


1 3 


SEQ ID NO: 


3614 


gactcaccctggacattca 


3390 3409 SEQ ID NO:


4955tgaaattcaatcacaagtc


9076 


9095 


1 3 


SEQ ID NO: 


3615


ctggacattcagaacaaga 


3398 3417 SEQ ID NO:


4956tcttttcttttcagcccag


9226 


9245 


1 3 


SEQ ID NO: 


3616 


tcatgggcgacctaagttg 


3435 3454 SEQ ID NO:


4957caactgcagacatatatga


6635 


6654 


1 3 


SEQ ID NO: 


3617 


tgggcgacctaagttgtga


3438 3457 SEQ ID NO:


4958tcactccattaacctccca


6316 


6335


1 3 


SEQ ID NO: 


3618 


agttgtgacacaaaggaag 


3449 3468 SEQ ID NO:


4959cttcttttccaattgaact


13838 


13857 


1 3 


SEQ ID NO: 


3619 


tgacacaaaggaagaaaga 


3454 3473 SEQ ID NO:


4960tcttcatcttcatctgtca


10220 


10239 


1 3 


SEQ ID NO: 


3620 


gacacaaaggaagaaagaa 


3455 3474 SEQ ID NO:


4961ttcttcatcttcatctgtc 


10219 


10238 


1 3 


SEQ ID NO: 


3621 


ggaagaaagaaaaatcaag 


3463 3482SEQIDNO 


4962 cttgtcatgcctacgttcc 


11348 


11367 


1 3 


SEQ ID NO: 


3622 


aaaatcaagggtgttattt 


3473 3492 SEQ ID NO:


4963aaatcttattggggatttt


7084 


7103 


1 3 


SEQ ID NO: 


3623 


tccataccccgtttgcaag 


3491 3510 SEQ ID NO:


4964 cttggattcaaaatgtgga 


6858 


6877 


1 3 


SEQ ID NO: 


3624 


tgcaagcagaagccagaag


3504 3523SEQIDNO 


4965cttcagggaacacaatgca 


5185 


5204 


1 3 


SEQ ID NO: 


3625 


cagaagccagaagtgagat 


3510 3529SEQIDNO 


4966 atctatgccatctcttctg


5633 


5652 


1 3 


SEQ ID NO: 


3626 


tgagatcctcgcccactgg


3523 3542SEQIDNO 


4967ccagcttccccacatctca


8341 


8360 


1 3 


SEQ ID NO: 


3627 


ggtcgcctgccaaactgct 


3540 3559 SEQ ID NO:


4968 agcacatatgaactggacc 


13947 


13966 


1 3 


SEQ ID NO: 


3628 


tgcttctccaaatggactc 


3555 3574 SEQ ID NO:


4969 gagtttatcagtcagagca 


9701 


9720 


1 3 


SEQ ID NO:


3629 


tggactcatctgctacagc 


3567 3586SEQIDNO 


4970 gctgcagtggcccgttcca 


8167 


8186 


1 3 


SEQ ID NO: 


3630 


gctacagcttatggctcca


3578 3597SEQIDNO 


4971 tggaggacattcctctagc 


8211 


8230 


1 3 


SEQ ID NO: 


3631 


ggtggcatggcattatgat 


3610 3629SEQIDNO 


4972atcacaaattagtttcacc 


8947 


8966 


1 3 


SEQ ID NO: 


3632 


agagaagattgaatttgaa 


3631 3650SEQIDNO 


4973ttcaacgatacctgtctct 


7713 


7732 


1 3 


SEQ ID NO: 


3633 


caggcaccaatgtagatac 


3657 3676 SEQ ID NO:


4974gtatgctaatagactcctg 


3736 


3755 


1 3 


SEQ ID NO: 


3634 


gacttccaatttccctgtg 


3685 3704SEQIDNO 


4975cacaatgcaaaattcagtc


5195 


5214 


1 3 


SEQ ID NO: 


3635 


gtc»ctcaaacagacatga 


3764 3783 SEQ ID NO:


4976tcataagggaggtagggac


12777 


12796 


1 3 


SEQ ID NO: 


3636 


caaacagacatgactttcc 


3770 3789SEQIDNO 


4977ggaactacaatttcatttg 


7022 


7041 


1 3 


SEQ ID NO: 


3637 


atagttgcaatgagctcat 


3809 3828SEQIDNO: 


4978 atgatttgaaaatagctat 


6693 


6712 ' 


1 3 


SEQ ID NO: 


3638 


gcttcagaaggcatctggg 


3829 3848SEQIDNO: 


4979cccaagaggtatttaaagc 


12957 


12976 


1 3 


SEQ ID NO: 


3639 


ggagttcaacctcsagaac 


3896 3914SEQIDNO: 


4980gttcactccattaacctcc


6314 


6333 


1 3 


SEQ ID NO: 


3640 


agaaaacctcttcttaaaa


3940 3959SEQIDNO: 


4981 ttttctaaatggaacttct 


12173 


12192 1 


1 3 


SEQ ID NO: 


3641 


aaaacctcttcttaaaaag 


3942 3961 SEQ ID NO: 


4982ctttgaaaaattctctttt


9213


9232 1 


1 3 


SEQ ID NO: 


3642 


aaaaagcgatggccgggtc 


3955 3974SEQIDNO: 


4983 gaccttgcaagaatatttt 


6343 


6362 1 


1 3 


SEQ ID NO: 


3643 


gtcaaatataccttgaaca


3971 3990SEQIDNO: 


4984tgttaacaaattccttgac 


7363 


7382 1 


1 3


SEQ ID NO: 


3644 


tgaacaagaacagtttgaa 


3984 4003SEQIDNO: 


4985ttcaagttcctgaccttca 


8310 


8329 1 


3 


SEQ ID NO: 


3645 


agtttgaaaattgagattc 


3995 4014SEQIDNO: 


4986 gaatctggctccctcaact 


9047 


9066 1 


3 


SEQ ID NO: 


3646 


gtttgaaaattgagattcc


3996 4015SEQIDNO: 


4987 ggaaataccaagtcaaaac


10454 


10473 1 


3 


SEQ ID NO: 


3647 


ttgaaaattgagattcctt 


3998 4017SEQIDNO: 


4988 aaggaaaagcgcacctcaa 


12030 


12049 1 


3 


SEQ ID NO: 


3648 


ctaaagatgttagagactg


4046 4065 SEQ ID NO:


4989cagttgaccacaagcttag


10545 


10564 1 


3 


SEQ ID NO: 


3649 


atgttagagactgttagga 


4052 4071SEQIDNO: 


4990tccttaacaccttccacat 


8073 


8092 1 


3 


SEQ ID NO: 


3650 


cagccctccacttcaagtc


4074 4093SEQIDNO: 


4991 gacttctctagtcaggctg 


8813 


8832 1 


3 


SEQ ID NO: 


3651 


agccctccacttcaagtct 


4075 4094SEQIDNO: 


4992agacatcgctgggctggct 


5728 


5747 1 


3 


SEQ ID NO: 


3652 


ccatctgccatctcgagag


4102 4121SEQIDNO: 


4993ctctcaaatgacatgatgg 


5330 


5349 1 


3 


SEQ ID NO: 


3653 


attcccaagttgtatcaac 


4142 4161SEQIDNO: 


4994 gttgagaagccccaagaat 


6254 


6273 1 


3 


SEQ ID NO: 


3654 


tcaactgcaagtgcctctc 


4156 4175SEQIDNO: 


4995gagatcaagacactgttga 


8843 


8862 1 


3 


SEQ ID NO: 


3655 


ggtgttctagacctctcca 


4178 4197SEQIDNO: 


4996 tggaaccctctccctcacc 


4735 


4754 1 


3 


SEQ ID NO: 


3656 


ctccacgaatgtctacagc 


4192 4211 SEQ ID NO: 


4997 gctggtaacctaaaaggag 


5588 


5607 1 


3 


SEQ ID NO: 


3657 


cacgaatgtctacagcaac 


4195 4214SEQIDNO: 


4998gttgcccaccatcatcgtg 


11671 


11690 1 


3 


SEQ ID NO: 


3658 


acgaatgtctacagcaact


4196 4215SEQIDNO: 


4999agttgcccaccatcatcgt 


11670 


11689 1 


3 


SEQ ID NO: 


3659


tcctacagtggtggcaaca


4232 4251SEQIDNO: 


5000tgttagttgctcttaagga


13359 


13378 1 


3 


SEQ ID NO: 


3660 


cgttaccacatgaaggctg 


4280 4299SEQIDNO: 


5001 cagcaagtacctgagaacg 


8611 


8630 1 


3 


SEQ ID NO: 


3661 


gaaggctgactctgtggtt 


4291 4310SEQIDNO: 


5002aacctatgccttaatcttc 


13169 


13188 1 


3 


SEQ ID NO:


3662 


tgtggttgacctgctttcc


4303 4322SEQIDNO: 


5003 ggaaagttaaaacaacaca 


6965 


6984 1 


3 
SEQ ID NO


3663 



cctgctttcctacaatgtg 


4312 


4331 SEQ ID NO 


5004 cacaccttaacattacaaa


11088 



11107 



1 3 


SEQ ID NO 


3664


ctgctttcctacaatgtgc 


4313 


4332 SEQ ID NO 


5005 acacaccttaacattacaa 


11087 



11106 



1 3 


SEQ ID NO 


3665 


tcctacaatgtgcaaggat 


4319 4338SEQIDNO 


5006 atccoctagctctaaaaaa 


8577 



8596 



1 3 


SEQ ID NO 


3666 


tatgaccacaagaatacgt 


4352 


4371 SEQ ID NO 


5007 acgtccgtgtaccttcata


9984 



10003 



1 3 


SEQ ID NO 


3667 


atgaccacaagaatacgtc 


4353 


4372SEQ ID NO 


5008 gacgtccgtgtaccttcat



9983 



10002 


1 3 


SEQ ID NO 


3668 


gaatacgtctacactatca 


4363 


4382SEQ ID NO 


5009tgattatctgaattcattc


6487 


6506


1 3 



SEQ ID NO 


3669 



tttctagattcgaatatca 


4406 4425 SEQ ID NO


501 Otaatttacataaitteiaaa 


6685 



6704 


1 3 



SEQ ID NO 


3670 


gattcgaatatcaaattca 


4412 


4431 SEQ ID NO 


SO1 1 taaaotaactaanaaaatc 


7102 



7191 


1 3 


SEQ ID NO


3671

gaaacaacccagtctcaaa 


4449 


4468 SEQ ID NO 


5012 tttgaaaaattctcttttc


9214


9233


1 1 
1 W 


SEQ ID NO


3672


cccagtctcaaaagqttta 


4456 


4475SEQ ID NO 


» 5013taaflttcatfflcfRrfnnn 
I WW 1 w laaciuwciiMiwiwwiyyM 


1 IwU^ 


11191 
1 lw& 1 


1 1 

1 w 


SEQ ID NO


3673

ctcaaaaggtttactaata 

■ ■•■w*^^^ £9 * **W MAIM % 


4462 4481SEQIDNO 


• 501 ^Inttrassartnantfnsin 




1 A«WU 


1 1 

1 w 


SEQ ID NO


3674

tcaaaaggtttactaatat


4463 4482SEQIDNO 


< 5ni5ntnffr*nnAAr*(nartHna 
WW 1 waiaiiv/OaciciV/l^ciyiiya 




400AQ 


1 1 
1 w 


SEQ ID NO


3675

aaaaggtttactaatattc


4465 4484SEQIDNO 


501 R dflatttnaaanttrnfttt 
■ ww 1 uyciaiiiyclclciyili^yilli 






1 1 
1 O 


SEQ ID NO


3676

gaaacagcatttgtttgtc


4543 


4562SEQ ID NO 


• 501 7 aafManatrttrnfntttr 


11914 


11911 


1 1 
1 O 


SEQ ID NO 


3677 

• WUf f 


atttgtttgtcaaagaagt


4551 


4570SEQ ID NO 


• &018acitaBaaaa(ataaanflt 


8022 

wl/£& 


8041 


1 1 
1 w 


SEQ ID NO


3678 


tcaagattgatgggcagtt 


4569 


4588SEQ ID NO


50 1 9 aactctcaantcaaatf na 

WW i0aawiw(WCiCijJlWCIQH»il4OI 


11499 


A1AAA 
1 wH*l 1 


1 1 
1 w 


SEQ ID NO


3679

ttcagagtctcttcgttct 


4586 4605SEQIDNO 


5020aoaanafnnnaaatttnna 

wwfcw ajjaci^cii^^wciaaiLiuaa 


1 1 9<7W 


i9niii 


1 1 
1 O 


SEQ ID NO 


3680 


cagagtctcttcgttctat 


4588


4607SEQ ID NO 


5021 ataacataaacttcttcta 


8871 

Ow f w 


OwC7i» 


1 1 
1 w 


SEQ ID NO 


3681 

www 1 


atgctaaaggcacatatgg 


4605 


4624SEQ ID NO 


5022 ccatttaaaatcacodcat 


9245 

Wfc*»W 


0964 


1 3 

i w 


SEQ ID NO 


3682 


gcacatatggcctgtcttg


4614 


4633SEQ ID NO 


5023 caaaitciacaaataacif dc 

w wfcw w%i»nj mjijwwcuj wibhj IgJW 


9379 

WWf & 


www 1 


1 1 

1 w 


SEQ ID NO 


3683 

> WWWW 


gagtccaacctgaggttta 


4667 


4686SEQ ID NO


5024 taaaotaccacftttactc 


6190


6209


f 1 

1 w 


SEQ ID NO 


3684 


agtccaacctgaggtttaa 


4668 


4687SEQ ID NO 


5025ttaacaaaaaaaataQact 


9308 


Q127 


1 1 

1 w 


SEQ ID NO 


3685 


cctacctccaaggcaccaa 


4692 


4711 SEQ ID NO 


5026ttggcaagtaagtgctagg


9376 

W^^ f 


9395 ' 

wwww 


1 3 

1 w 


SEQ ID NO 


3686 

^# ^# ^^^^ 


gaagatggaaccctctccc 


4730 


4749SEQ ID NO. 


5027 gggaagaagaggcagcttc


12291 

V MwW 1 


12310 ' 

1 hW I W 


1 3


SEQ ID NO:


3687


tgatctgcaaagtggcatc 


4762 


4781 SEQ ID NO:


5028gatgaggaaactcagatca


12263 


12282 ' 


1 3 


SEQ ID NO 


3688 


gatctgcaaagtggcatca 


4763 


4782SEQ ID NO 


5029tgatgaggaaactcagatc


12262 


12281 


1 3 

1 w 


SEQ ID NO 


3689 


gcttccctaaagtatgaga 


4793 


4812SEQ ID NO. 


5030 tctcgtgtctaggaaaagc


5977


5996 


1 3


SEQ ID NO 


3690 


gtatgagaactacgagctg 


4804 


4823SEQ ID NO:


5031 cagcttaagaaacacatac


6920 

w wfcw 


6939 

Wwww 


1 3


SEQ ID NO: 


3691 

Www 1 


tctaacaagatggatatga 


4868 


4887SEQ ID NO:


5032tcattttccaactaataaa 


13032 

1 w ww^B 


13051 1 

1 www 1 


1 3


SEQ ID NO: 


3692 


ctgctgcgttctgaatatc 


4907 


4926SEQ ID NO: 


5033 aatacaaaaaaaacteicaa 


6901 

Www 1 


6920 1 


1 3


SEQ ID NO: 


3693 


tcattgaggttcftcagcc 


4940 4959SEQ ID NO:


5034 gqctcatatqctqaaataa 


5348 

W^* i^W 


5367 1 


1 3 

1 w 


SEQ ID NO:


3694


ttctggatcactaaattcc 


4963 


4982 SEQ ID NO: 


5035qqaaQqacaaqQCCcaaaa 



12549 


1 2568 1 


1 3 

1 w 


SEQ ID NO:


3695 


ccatggtcttgagttaaat 


4981 


5000SEQ ID NO: 


5036 atttttattf^taccatoa 

wwww a iMMQ ii^^iu wwa luu 


IVI lUw 


ini99 i 


I 1 

w 


SEQ ID NO:
SEQ ID NO:


3696

3697


tcttaggcactgacaaaat 
acaaqgcgacactaaQQat 


5007 
5040 


5026SEQ ID NO:
5059SEQ ID NO: 


5037attttttacaaattaaaaa 


Itw 1 w 

R7Q4 


1401R i 

l*TVOO 

RA11 -1 
DO 1 0 


1 1 
1 w 

o 
o 


wwwf ci(uiii||wMCiyiiGwiGiyci 

6038 atcRatn af rta raif tnf 


SEQ ID NO:


3698

tccaacgaccaacttgaag


5083 


5102SEQ ID NO: 


503fl rrf Inannna ana paafnpa 


w 1 Ow 




Q 

w 


SEQ ID NO:


3699 


caacttgaagtgtagtctc 


5092 


5111 SEQ ID NO:


6040 aaciataaaaciatansatio 




6258 1 


•a 

w 


SEQ ID NO:


3700


gctggagaatgagctgaat 


5116 


5135SEQ ID NO: 


5041 attctcttttcttttcaac 

ww^ f wi ^%w^w%»*»wfcmwoMw 


8222 


0241 1 


o 

w 


SEQ ID NO:


3701 


gcagagcttggcctctctg


5135 5154SEQ ID NO:


5042 caaatanaanaaaaartnp 
ww^b wdyciiawaGi^aaGiciciwi^w 


DO wo 


8018 1 


q 

w 


SEQ ID NO:


3702 



tctctggggcatctatgaa 


5148 


5167SEQ ID NO: 


604 3 ttcattcaatta aa aa aa a 


649q 


6518 1 

v/w 1 u 1 


Q 
w 


SEQ ID NO:


3703 


tctggggcatctatgaaat 


5150 


5169SEQ ID NO: 


5044atttQtaaaaaaatacaaa 


6436 

U"TWW 


645<) 1 

Vtww I 


o 


SEQ ID NO: 


3704 


aacacaatgcaaaattcag 


5193 


5212SEQ ID NO: 


5045 ctgaagcattaaaactgtt 


7506 


7525 1 


3 


SEQ ID NO: 


3705 


ctcacagagctatcactgg 


5231 


5250SEQ ID NO: 


5046ccagatgctgaacagtgag


8149 


8168 1 


3 


SEQ ID NO: 


3706 


tgggaagtgcttatcaggc 


5247 


5266SEQ ID NO:


5047gcctacgttccatgtc3cca


11356 


11375 1 


3 


SEQ ID NO: 


3707 


ttcaaggtcagtcaagaag 


5303 


5322SEQ ID NO:


5048 cttcagtgcagaatatgaa 


11977 


11996 1


3 


SEQ ID NO: 


3708 


aatgacatgatgggctcat 


5336 


5355SEQ ID NO: 


5049 atgattatctgaattcatt 


6486 


6505 1 


3 


SEQ ID NO: 


3709 


gctcatatgctgaaatgaa 


5349 


5368SEQ ID NO:


5050ttcagccattgacatgagc 


5746 


5765 1 


3 


SEQ ID NO: 


3710 


atatgctgaaatgaaattt 


5353 


5372SEQ ID NO: 


5051 aaatagctattgctaatat 


6702 


6721 1 


3 


SEQ ID NO: 


3711 


tctgaacattgcaggctta 


5386 


5405 SEQ ID NO: 


5052 taagaaccagaagatcaga 


10996 


11015 1 


3 


SEQ ID NO: 


3712 


gaacattgcaggcttatca 


5389 


5408 SEQ ID NO: 


5053tgatatcgacgtgaggttc


12490 


12509 1 


3 
SEQ ID NO:


3713 


tgcaggcttatcactggac 


5395 


5414sEQ IP |v|c 


5054 gtcctggattccacatgca


11852 


11871 


1 


3 


SEQ ID NO: 


3714 


tcaaaacttgacaacattt 


5420 


5439 SEQ ID NO:


5055 aaattccttgacatgttga


7370 


7389 


1 


3 


SEQ ID NO: 


3715 


atttacagctctgacaagt 


5435 


6454 SEQ ID NO:


5056 acttaaaaaatataaaaat


8022 


8041 


1 


3 


SEQ ID NO: 


3716 


ctctgacaagttttataag 


5443 


5462 SEQ ID NO:


5057 cttacttgaattccaagag


10674 


10693 


1 


3 


SEQ ID NO: 


3717 


gttaatttacagctacagc 


5468 


5487 SEQ ID NO:


5058 gctgcatgtggctggtaac


5578 


5697 


1 


3 


SEQ ID NO: 


3718 


ttctctggtaactacttta 


5491 


5510 SEQ ID NO:


5059 taaaagattactttgagaa



7275 


7204 


1 


3 


SEQ ID NO: 


3719 


cctaaaaggagcctaccaa 


5596 


5615 SEQ ID NO:


5060 ttggcaagtaagtgctagg


9376 


9395 


1 


3 


SEQ ID NO: 


3720 


aaaaggagcctaccaaaat 


5599 


5618 SEQ ID NO:


5061atttacaattgttgctttt 


6271 


6290 


1 


3 


SEQ ID NO: 


3721 


aggagcctaccaaaataat 


5602 


5621 SEQ ID NO:


5062 attacctatgatttctcct


10127 


10146 


1 


3 


SEQ ID NO: 


3722 


ataatgaaataaaacacat 


5616 


5635 SEQ ID NO:


5063atgtcaaacactttgttat 


7065 


7084 


1 


3 


SEQ ID NO: 


3723 


aaaacacatctatgccatc 


5626 


5645 SEQ ID NO:


5064gatgaagatgacgactttt 


12158 


12177 


1 


3 


SEQ ID NO: 


3724 


tgctaaggttcagggtgtg 


5686 


5705 SEQ ID NO:


5065cacaagtcgatteccagca 


9087 


9106 


1 


3 


SEQ ID NO: 


3725 


gagtttagccatcggctca 


5705 


5724 SEQ ID NO:


5066tgaggtgactcagagactc 


7450 


7469 


1 


3 


SEQ ID NO: 


3726 


gctggcttcagccattgac 


5740 


5759 SEQ ID NO:


5067 gtcagtgaagttctccagc


8596 


8615 


1 


3 


SEQ ID NO: 


3727 


atttcagcaatgtcttccg 


5790 


5809 SEQ ID NO:


5068 cggagcatgggagtgaaat


8628 


8647 


1 


3 


SEQ ID NO: 


3728 


tttcagcaatgtcttccgt 


5791 


5810 SEQ ID NO:


5069acggagcatgggagtgaaa 


8627 


8646 


1 


3 


SEQ ID NO: 


3729 


ttcagcaatgtcttccgtt 


5792 


5811 SEQ ID NO:


5070aacggagcatgggagtgaa 


8626 


8645 


1 


3 


SEQ ID NO: 


3730 


cagcaatgtcttccgttct 


5704 


5813 SEQ ID NO:


5071 agaagtgtcttcaaagclg 


12412 


12431 


1 


3 


SEQ ID NO: 


3731 


tgtcttccgttctgtaatg 


6800 5819 SEQ ID NO:


5072 cattcaattgggagagaca


6501 


6520 


1 



3 


SEQ ID NO: 


3732 


gtcttccgttctgtaatgg 


5801 


6820 SEQ ID NO:


5073 ccattcagtctctcaagac


12975 


12994 


1 


3 


SEQ ID NO: 


3733 


atgggaaactcgctctctg 


5869 


5878 SEQ ID NO:


5074 cagataaaaaactcaccat


12213 


12232 


1 


3 


SEQ ID NO: 


3734 


ggagaacatactgggcagc 


5879 


6898 SEQ ID NO:


5075gctgttttgaagactctGC 


1088 


1107 


1 


3 


SEQ ID NO: 


3735


gttgaaagcagaaoctctg 


5914 6933 SEQ ID NO:


5076cagaattcataatcccaac 


8274 


8293 


1 


3 


SEQ ID NO: 


3736 


gtctaggaaaagcatcagt 


5983 


6002 SEQ ID NO:


5077 actgcaagatttttcagac 


13612 


13631 


1 


3 


SEQ ID NO: 


3737 


agcatcagtgcagctcttg 


6993 


8012 SEQ ID NO:


5078caagaacctgttagttgct 


13351 


13370 


1 


3 


SEQ ID NO: 


3738 


ttgaacacaaagtcagtgc 


6009 


6028 SEQ ID NO:


5079gcacatcaatattgatcaa 


6418 


6437 


1 


3 


SEQ ID NO: 


3739 


gcagacaggcacctggaaa 


6046 


6085 SEQ ID NO:


5080tttcagatggcattgctgc 


11610 


11629 


1 



3 


SEQ ID NO: 


3740 


gaaactcaagacccaattt 


6061 


6080 SEQ ID NO:


5081 aaatcccatccaggttttc 


8037 


8066 


1 


3 


SEQ ID NO: 


3741 


acaatgaatacagccagga 


6084 


6103 SEQ ID NO:


5082 tcctttggctgtgctttgt


9882 


9701 


1 


3 


SEQ ID NO: 


3742 


cttggatgcttacaacact 


6103 


6122SEQ ID NO: 


5083 agtgaagttctccagcaag


8509 


8618 


1 


3 


SEQ ID NO: 


3743 


ttggcgtggagcttactgg 


6132 


6151 SEQ ID NO: 


5084 ccagaattcataatcccaa


8273 


8292 


1 


3 


SEQ ID NO: 


3744 


cacttttactcagtgagcc 


6198 


6217SEQ ID NO: 


5085ggctattgatgttagagig 


6988 


7007 


1 


3 


SEQ ID NO: 


3745 


tttagagatgagagatgcc 


6235 


9254SEQ ID NO: 


5086ggcatgatgctcatttaaa 


9177 


9196 


1 


3 


SEQ ID NO: 


3746 


gagaagccccaagaattta 


6257 


6276SEQ ID NO: 


5087taaagccattcagtctctc 


12970 


12989 


1 


3 


SEQ ID NO: 


3747 


caattgttgcttttgtaaa 


6276 


6295SEQ ID NO: 


5088 tttaaccagtcagatattg


10187 


10206 


1 


3 


SEQ ID NO: 


3748 


ttttgtaaagtatgataaa 


6286 


6305SEQ ID NO: 


5089tttattgctgaatccaaaa 


13655


13674 


1 


3 


SEQ ID NO: 


3749 


ttgtaaagtatgataaaaa 


6288 


6307SEQ ID NO: 


5090ttttgagaggaatcgacaa 


6358 


6377 


1 


3 


SEQ ID NO: 


3750 


ttcactccattaacctocc 


6316 


6334SEQ ID NO: 


5091 gggaaaaaacaggcttgaa 


9576 


9595 


1 


3 


SEQ ID NO: 


3751 


ttttgagaccttgcaagaa 


6337 


6356SEQ ID NO: 


5092 ttctctctatgggaaaaaa


9566 


9585 


1 


3 


SEQ ID NO: 


3752 


accttgcaagaatattttg 


6344 


6363SEQ ID NO: 


5093caaaagaagcccaagaggt 


12948 


12967 


1 


3 


SEQ ID NO:


3753 


tcaatattgatcaatttgt 


6423 


6442SEQ ID NO: 


5094 acaaagcagattatgttga 


11829 


11848 


1 


3 


SEQ ID NO: 


3754 


cagagcagccctgggaaaa 


6451 


6470SEQ ID NO: 


5095 ttttcagaccaactctctg


13622 


13641 


1 


3 


SEQ ID NO: 


3755


cctgggaaaactcccacag 


6460 


6479SEQ ID NO: 


5096ctgtctctggtcagccagg 


7724 


7743 


1 


3 


SEQ ID NO: 


3756


actcccacagcaagctaat 


6469 


6488SEQ ID NO: 


5097 attacacttcctttcgagf 


12869 


12888 


1 


3 


SEQ ID NO: 


3757 


aattcattcaattgggaga 


6497 


6516SEQ ID NO: 


5098 tctcttcctccatggaatt 


10479 


10498 


1 


3 


SEQ ID NO: 


3758 


ttcaattgggagagacaag 


6503 


6522SEQ ID NO: 


5099cttggagtgccagmgaa 


11808 


11827 


1 


3 


SEQ ID NO: 


3759 


aggagaaactgactgctct 


6534 


6553SEQ ID NO: 


5100 agagcttatgggatttcct


11163 


11182 


1 


3 


SEQ ID NO: 


3760 


actgactgctctcacaaaa 


6541 


6560SEQ ID NO: 


5101 ttttggcaagctatacagt 


8380 


8399 


1 


3 


SEQ ID NO: 


3761 


gactgctctcacaaaaaag 


6544 


6563 SEQ ID NO: 


5102ctttgtgagtttatcagtc 


9695 


9714 


1 


3 


SEQ ID NO: 


3762 


cagacatatatgatacaat 


6641 


6860SEQ ID NO: 


5103 attggatafocaagatctg


1933 


1952 


1 


3 
SEQ ID NO:


3763 


aatttgatcagtatattaa 


6657 


6676 SEQ ID NO:


5104 ttaaaagaaatcttcaatt


13815 


13834 


1 3 


SEQ ID NO:


3764 


tatgatttacatgatttga 


6683 


6702 SEQ ID NO:


5105 tcaatgattatatcccata


13128 


13147 


1 3 


SEQ ID NO: 


3765 


tttgaaaatagctattgct 


6697 


6716 SEQ ID NO:


5106 agcacagaaaaaattcaaa


13864 


13883 


1 3 


SEQ ID NO: 


3766


ttgaaaatagctattgcta 


6698 


6717 SEQ ID NO:


5107 tagcacagaaaaaattcaa


13863 


13882 


1 3 


SEQ ID NO: 


3767 


aatagctattgctaatatt 


6703 


6722 SEQ ID NO:


5108 aataaatggagtctttatt


14084 


14103 


1 3 


SEQ ID NO: 


3768 


attattgatgaaatcattg


6719 


6738 SEQ ID NO:


5109 caataccagaattcataat


8268 


8287 


1 3 


SEQ ID NO: 


3769 


aaagtcttgatgagcacta 


6747 


6766 SEQ ID NO:


>: 5110fagtgattacacttccttt 


12864 


12883 


1 3 


SEQ ID NO: 


3770 


aagtcttgatgagcactat 


6748 


6767 SEQ ID NO:


5111 atagcaacactaaatactt


8769 


8768 


1 3 


SEQ ID NO: 


3771 


ttgatgagcactatcatat


6753 


6772 SEQ ID NO:


5112 atatccaagatgagatcaa


13101 


13120 


1 3 


SEQ ID NO: 


3772 


taattttagtaaaaacaat 


8777 


5796 SEQ ID NO:


5113 attgagattccctccatta


11702 


11721 


1 3 


SEQ ID NO: 


3773 


ttttagtaaaaacaatcca 


6780 


6799 SEQ ID NO:


5114 tggagtgccagtttgaaaa


11810 


11829 


1 3 


SEQ ID NO: 


3774 


acatttgtttattgaaaat 


6805 


6824 SEQ ID NO:


5115 atttcctaaagctggatgt


11175 


11194 


1 3 


SEQ ID NO: 


3775 


attgattttaacaaaagtg 


6824 6843 SEQ ID NO:


5116cactgttccagttgtcaat 


9871 


9890 


1 3 


SEQ ID NO: 


3776 


attttaacaaaagtggaag 


6828 


6847 SEQ ID NO:


5117 cttcaaagacttaaaaaat


8014 


8033 


1 3 


SEQ ID NO: 


3777 


aaatcagaatccagataca 


6888 


6907 SEQ ID NO:


5118 tgtaccataagccatattt


10088 


10107 


1 3 


SEQ ID NO: 


3778 


gaatccagatacaagaaaa 


6894 


6913 SEQ ID NO:


5119 ttttctaaacttgaaattc


9065 


9084 


1 3 


SEQ ID NO: 


3779 


ttaagagacacatacagaa 


6924 


6943 SEQ ID NO:


5120 ttcttaaacattccmaa


9491 


9510 


1 3 


SEQ ID NO:


3780 


atccagcacctagctggaa 


6950 


6969 SEQ ID NO:


5121 ttccaatttccctglggat 


3688 


3707 


1 3 


SEQ ID NO: 


3781 


tgagcatgtcaaacacttt 


7060 


7079 SEQ ID NO:


5122 aaagtgccacttttactca


6191 


6210 


1 3 


SEQ ID NO: 


3782 


gagcatgtcaaacactttg 


7061 


7080 SEQ ID NO:


5123 caaatgacatgatgggctc


5334 


5353 


1 3 


SEQ ID NO: 


3783 


aaacactttgttataaatc 


7070 


7089 SEQ ID NO:


5124gattatatcccatatgttt 


13133 


13152 


1 3 


SEQ ID NO: 


3784 


tgagaaaafcaatgocttc 


7111 


7130 SEQ ID NO:


5125gaaggaaaagcgcacctca 


12029 


12048 


1 3 


SEQ ID NO: 


3785 


tatgaagtagaccaacaaa 


7160 


7179 SEQ ID NO:


5126tttgtggagggtagtcata 


10331 


10350


1 3 


SEQ ID NO: 


3786 


aagtagaccaacaaatcca


7164 


7183 SEQ ID NO:


5127 tggatgaagatgacgactt


12156 


12175


1 3 


SEQ ID NO: 


3787 


aagttgaaggagactattc 


7223 


7242 SEQ ID NO:


5128 gaataccaatgctgaactt


10168 


10187


1 3 



SEQ ID NO: 


3788 


acaagttaagataaaagat 


7264 


7283 SEQ ID NO:


512gatctaaattcagttcngt 


11334 


11353 1 



1 3 


SEQ ID NO: 


3789 


aagataaaagattactttg 


7271 


7290 SEQ ID NO:


5130 caaaatagaagggaatctt


2077 


2096 1 


1 3 



SEQ ID NO: 


3790


gattactttgagaaattag 


7280 


7299SEQ ID NO: 



5131 ctaaacttgaaattcaatc 


9069 


9088 1 


1 3 



SEQ ID NO: 


3791 


tgagaaattagttggattt 


7288 


7307SEQ ID NO: 


5132 aaatccgtgaggtgactca


7443 


7462 1 


3 



SEQ ID NO: 


3792 


aaattagttggatttattg 


7292 


7311 SEQ ID NO:


5133 caattttgagaatgaattt


10419 


10438 1 


3 


SEQ ID NO: 


3793 


tggatttattgatgatgc^ 


7300 


7319SEQ ID NO: 


5134agcatgcctagtttctcca 


9953 


9972 1 


3 


SEQ ID NO:


3794 


tcattgaagatgttaacaa 


7353 


7372SEQ ID NO: 


5135ttgtagatgaaaccaatga 


7422 


7441 1 


3 


SEQ ID NO: 


3795 


cattgaagatgttaacaaa


7354 


7373SEQ ID NO: 


5136 tttgtagatgaaaccaatg


7421 


7440 1 


3 


SEQ ID NO: 


3796 


attgaagatgttaacaaat 


7355 


7374SEQ ID NO: 


5137 atttaagtatgatttcaat


10495 


10514 1 


3 


SEQ ID NO: 


3797 


ttgaagatgttaacaaatt 


7356 


7375SEQ ID NO: 


5138 aatttaagtatgatttcaa


10494 


10513 1 


3 


SEQ ID NO: 


3798 


tgaagatgttaacaaattc 


7357 


7376SEQ ID NO: 


5139 gaatttaagtatgatttca


10493 


10512 1 


3 



SEQ ID NO: 


3799 


acatgttgataaagaaatt 


7380 


7399 SEQ ID NO: 


5140 aattccctgaagttgatgt


11487 


11506 1 


3 


SEQ ID NO: 


3800 


tttgattaccaccagtttg 


7406 


7425 SEQ ID NO: 


5141 caaattgaacatccccaaa 


8791 


8810 1 


3 


SEQ ID NO: 


3801 


caaaatccgtgaggtgact 


7441 


7460 SEQ ID NO:


5142agtccccctaacagatttg 


7972 


7991 1 


3 


SEQ ID NO: 


3802 


aaaatccgtgaggtgactc 


7442 


7461 SEQ ID NO: 


5143 gagtgaaatgctotttttt


8638 


8657 1 


3 


SEQ ID NO: 


3803 


aggtgactcagagactcaa 


7452 


7471 SEQ ID NO:


5144 ttgatgatatctggaacct


10731 


10750 1 


3 


SEQ ID NO: 


3804 


gtgaaattcaggctctgga 


7473 


7492SEQ ID NO: 


5145 tccaatctcctcttttcac


8409 


8428 1 


3 


SEQ ID NO: 


3805


gttgcagtgtatctggaaa 


7547 


7566SEQ ID NO: 


5146 tttcaagcaaatgcacaac


8540 


8559 1 


3 


SEQ ID NO: 


3806 


ttaagttcagcatctttgg 


7616 


7636SEQ ID NO: 


5147 ccaatgctgaactttttaa


10173 


10192 1 


3 


SEQ ID NO: 


3807 


tgaaggccaaattccgaga 


7641 


7660 SEQ ID NO: 


5148tctccm(;ttcatcttca 


10213 


10232 1 


3 


SEQ ID NO: 


3808 


aatgtatcaaatggacatt 


7684 


7703SEQ ID NO: 


5149 aatgaagtccggattcatt


11021 


11040 1 


3 


SEQ ID NO: 


3809 


attcagcaggaacttcaac


7700 


7719SEQ ID NO: 


5150 gttgagaagccccaagaat


6254 


6273 1 


3 


SEQ ID NO: 


3810 


acctgtctctggtcagcca 


7722 


7741 SEQ ID NO: 


5151 tggcaagtaagtgctaggt


9377 


9396 1 


3 


SEQ ID NO: 


3811 


cctgtctctggtcagccag 


7723 


7742SEQ ID NO: 


5152 ctggacttctctagtcagg


8810 


8829 1 


3 


SEQ ID NO: 


3812 


ggtcagccaggtttatagc


7732 


7751 SEQ ID NO: 


5153 gctaaaggagcagttgacc


10535 


10554 1 


3 
SEQ ID NO:


3813


SEQ ID NO:


3814


SEQ ID NO:


3815


SEQ ID NO:


3816


SEQ ID NO:


3817


SEQ ID NO:


3818


SEQ ID NO:


3819


SEQ ID NO:


3820


SEQ ID NO:


3821


SEQ ID NO:


3822


SEQ ID NO:


3823


SEQ ID NO:


3824


SEQ ID NO:


3825


SEQ ID NO:


3826


SEQ ID NO:


3827


SEQ ID NO:


3828


SEQ ID NO:


3829


SEQ ID NO:


3830


SEQ ID NO:


3831


SEQ ID NO:


3832


SEQ ID NO:


3833


SEQ ID NO:


3834


SEQ ID NO:


3835


SEQ ID NO:


3836


SEQ ID NO:


3837


SEQ ID NO:


3838


SEQ ID NO:


3839


SEQ ID NO:


3840


SEQ ID NO:


3841


SEQ ID NO:


3842


SEQ ID NO:


3843


SEQ ID NO:


3844


SEQ ID NO:


3845


SEQ ID NO:


3846


SEQ ID NO:


3847


SEQ ID NO:


3848


SEQ ID NO:


3849


SEQ ID NO:


3850


SEQ ID NO:


3851


SEQ ID NO:


3852


SEQ ID NO:


3853 


SEQ ID NO: 


3854 


SEQ ID NO: 


3855 


SEQ ID NO: 


3856 


SEQ ID NO: 


3857 


SEQ ID NO: 


3858 


SEQ ID NO: 


3859 


SEQ ID NO: 


3860 


SEQ ID NO: 


3861 


SEQ ID NO: 


3862 



ccaggtttatagcacactt 

gtttatagcacacttgtca 

acttgtcacctacatttct 

ctgattggtggactcttgc 

atgaaagcattggtagagc 

tgaaagcattggtagagca 

gggttcactgttcctgaaa 

tcaagaccatccttgggac 

ccttgggaccatgoctgoc 

ttcaggctcttcagaaagc 

ttcagataaacttcaaaga 

acttcaaagacttaaaaaa 

atcccatccaggttttcca 

gaatttaccatccttaaca 

cattccttcctttacaatt 

ttgaccagatgctgaacag 

aatcaccctgocagacttc 

tgaccttcacataccagaa 

ttccagcttccccacatct 

aagctatacagtattctga 

attctgaaaatocaatctc 

tttcacattagatgcaaat 

caaatgctgacatagggaa 

gagagtccaaattagaagt 

agagtccaaattagaagtt 

tctcaattttgattttcaa 

caattttgattttcaagca 

aatgcacaactctcaaacc 

agttctccagcaagtacct 

agtacctgagaacggagca 

tcaaacacagtggcaagtt 

acaatcagcttaccctgga 

ctggatagcaacactaaat 

cigacctgcgcaacgagat 

agatgagggaacacatgaa 

tcaacttttctaaacttga 

ttctaaacttgaaattcaa 

gaaattcaatcacaagtcg 

cactgtttggagaagggaa 

actgtttggagaagggaag 

aattctcttttcttttcag 

ttcttttcagcccagccat 

tttgaaagttcgttttcca 

cagggaagatagacttcct 

ataagtacaaccaaaattt 

acaacgagaacattatgga 

aggaataaatggagaagca 

agcaaatctggatttctta 

tcctttaacaattcctgaa 

tttaacaattcctgaaatg 



7738 7757 SEQ ID NO:
7742 7761 SEQ ID NO:
7753 7772 SEQ ID NO:
7770 7789 SEQ ID NO:

7847 7866 SEQ ID NO:

7848 7867 SEQ ID NO:
7868 7887 SEQ ID NO:
7887 7906 SEQ ID NO:
7897 7916 SEQ ID NO:
7929 7948 SEQ ID NO:
8004 8023 SEQ ID NO:
8013 8032 SEQ ID NO:
8039 8058 SEQ ID NO:
8083 8082 SEQ ID NO:
8089 8108 SEQ ID NO:
8145 8164 SEQ ID NO:
8233 8252 SEQ ID NO:
8320 8339 SEQ ID NO:
8339 8358 SEQ ID NO:
8387 8406 SEQ ID NO:
8399 8418 SEQ ID NO:
8422 8441 SEQ ID NO:
8436 8455 SEQ ID NO:

8508 8527 SEQ ID NO:

8509 8528 SEQ ID NO:
8527 8546 SEQ ID NO:
8530 8549 SEQ ID NO:
8549 8668 SEQ ID NO:
8604 8623 SEQ ID NO:
8616 8635 SEQ ID NO:
8678 8697 SEQ ID NO:
8751 8770 SEQ ID NO:
8765 8784 SEQ ID NO:
8829 8848 SEQ ID NO:
8929 8948 SEQ ID NO:
9060 9079 SEQ ID NO:
9067 9086 SEQ ID NO:
9077 9096 SEQ ID NO:

9141 9160 SEQ ID NO:

9142 9161 SEQ ID NO:
9221 9240 SEQ ID NO:
9230 9249 SEQ ID NO:
9283 0302 SEQ ID NO:
9312 9331 SEQ ID NO:
9405 9424 SEQ ID NO:
9435 9454 SEQ ID NO:
9463 9482 SEQ ID NO:
9478 9497 SEQ ID NO:
9602 9521 SEQ ID NO:
9505 9524 SEQ ID NO:



5154 aagtccggattcattctgg

5155 tgacctgtccattcaaaac

5156 agaaaaaggggattgaagt

5157 gcaagttaaagaaaatcag

5158 gctcatdcxrtttcltcat

5159 tgctcatctccmcttca

5160 tttcaccatagaaggaccc

5161 gtccccctaacagatttga

5162 ggcaccagggctcggaagg

5163 gcttgaaggaattcttgaa

5164 tcttcataagttcaatgaa

5165 ttttaacaaaagtggaagt

5166 tggagaagcaaatctggat

5167 tgttgaagtgtctccattc

5168 aattcx^aattttgagaatg

5169 ctgttgaaagatttatcaa
5170 gaagttctcaatmgatt

5171 ttcttctggaaaagggtca

5172 agattctcagatgagggaa
5173 tcagatggcattgctgctt
5174 gagataaccgtgcctgaat

5175 attttgaaaaaaacagaaa

5176 ttccatcacaaatcctttg
5177 actttacttcccaactctc
5178 aactttacttcccaactct
5179 ttgattcccttmtgaga
5180 tgctgaatccaaaagattg

5181 ggtttatcaaggggcxatt

5182 aggttccatcgtgcaaact
5183 tgctcGaggagaacUact
5184 aactctcaagtcaagttga
5185 tccattctgaatatattgt
5186 attttctgaacttccccag
5187 atctgatgaggaaactcag
5188 ttcatgtccctagaaatct
5189 tcaaggataacgtgtttga
5190 ttgatgatgctgtcaagaa
5191 cgacgaagaaaataatttc
5192 ttccagaaagcagccagtg
5193 cttccccaaagagaccagt
5194 ctgattactatgaaaaatt

5195 atggaaaagggaaagagaa
5196 tggaagtgtcagtggcaaa

5197 aggacctttcaaattcctg

5198 aaatcaggatctgagttat
5199 tccattctgaatatattgt
5200 tgclggaattgtcattcct
5201 taagttctctgtacctgct
5202 ttcaaaacgagcttcagga
5203 catttgatttaagtgtaaa



11025 


1 1044 


1 3 


■ wWV 1 


13700 


1 3 


10283 


10302 


1 3 


iAn9R 




1 3 


1 \J£,\JQ 


10997 


1 3 




1099R 


1 3 




8Q78 


1 3 


7973 


7999 


1 3 


13978 




1 3 


9588 




1 3 
1 o 


lO lOw 


13909 


1 3 




RR4fi 


1 3 

1 w 




9401 


1 3 
1 o 


QRRQ 




1 3 


lOdlil 


10433 


1 3 


19Q39 


1 9QR1 


1 3 
1 O 


8<^99 




1 3 


fl8Rd 

000*T 


8903 


1 3 


8091 


RQ40 


1 3 


1 1R19 


1 IQO 1 


1 Q 






1 O 


0738 


0767 


1 3 

1 o 




9689 


1 3 
1 o 


13410 


13499 


1 3 




13498 


1 3 

1 o 


11537 


11<>56 * 


1 3 

1 o 


13660 


13679 1 


1 3 
1 o 


19450 


19479 ' 


1 3 


11388 


11407 1 


1 3- 


13780 


1 3799 1 

1 Ot OO 


1 3 


13499 


13441 1 


1 3 
1 o 


13380 


13399 i 
i oooo 


1 3 
t O 


12709 


19791 ^ 


o 

0 


12259 


1 2978 1 


o 
o 


10038 


ionf%7 1 


4 
0 


19618 


1 9R37 1 




7308 
r OvO 


7397 1 

f O^i \ 




i3<i6R 




q 




1 ORDR 1 


q 




90i7 1 
4o1 / 1 


q 
O 


13638 


1 3657 1 


3 


13494 


13513 1 


3 


10380 


10399 1 


3 


0848 


9867 1 


3 


14038 


14057 1 


3 


13380 


13399 1 


3 


11734 


11753 1 


3 


11719 


11738 1 


3 


13206 


13225 1 


3 


9621 


9640 1 


3 
SEQ ID NO:


3863 


acacaataaicacaactcc 


0534 9653 SEQ ID NO:


5204 ggagacagcatcttcgtgt


11211 


11230 


1 3 


SEQ ID NO: 


3864 


aagatnctctciatggga 


9561 8580 SEQ ID NO:


5205 tcocagaaaBcctcttctt


3936 


3955 


1 3 


SEQ ID NO: 


3865


gaaaaaacaggcttgaagg 


9578 9597 SEQ ID NO:


5206 ccttttacaattcattttc


13021 


13040 


1 3 


SEQ ID NO: 


3866 


itgaaggaancttgaaaa 


9590 9609 SEQ ID NO:


5207 ttttgagaatgaatttcaa


10422 


10441 


1 3 


SEQ ID NO: 


3867 


tgaaggaattcttgaaaac 


9591 9810 SEQ ID NO:


5208 gttttggctgataaattca


11291 


11310 


1 3 


SEQ ID NO: 


3868 


agctcagtataagaaaaac 


9640 9659 SEQ ID NO:


5209 gtttgataagtacaaagct


9605 


9824 


1 3 


SEQ ID NO: 


3869 


icaaaicciugacaggca 


9720 9739 SEQ ID NO:


5210 tgcctgagcagaccattga


11688 


11707 


1 3 


SEQ ID NO: 


3870 


aigaaacaaaaattaagtt 


9789 9808 SEQ ID NO:


5211 aactttgcactatgttcat


12762 


12781 


1 3 


SEQ ID NO: 


3871 


aancciggatacactgtt 


9859 9878 SEQ ID NO:


5212 aacacatgaatcacaaatt


8938 


8957 


1 3 


SEQ ID NO: 


3872 


itccagugtcaatgttga 



9876 9895 SEQ ID NO:


5213 tcaaaacgagcttcaggaa


13207 


1322G 


1 3 


SEQ ID NO: 


3873 


aagtgtctccattcaccat 



9694 9913 SEQ ID NO:


5214atgggaagtataagaactt 


4842 


4861 


1 3 


SEQ ID NO: 


3874 


gtcagcatgcctagtttct 


9950 9969 SEQ ID NO:


5215 agaaaaggcacaccttgac


11080 


11099 


1 3 


SEQ ID NO: 


3875 


ctgccatgggcaatattac 


10113 10132 SEQ ID NO:


5216 gtaagaaaatacagagcag


6440 


6459 


1 3 


SEQ ID NO: 


3876 


tgaataccaatgctgaact


10167 10186 SEQ ID NO:


5217 agttgaaggagactattca


7224 


7243 


1 3 


SEQ ID NO: 


3877 


tattgttgctcatctcctt 


10201 10220 SEQ ID NO:


5218 aaggaaacataaactaata


12889 


12908 


1 3 


SEQ ID NO: 


3878 


tgttgctcatctcctttct


10204 10223 SEQ ID NO:


5219 agaagaaatctgcagaaca


12431 


12450 


1 3 


SEQ ID NO: 


3879 


tctgtcattgatgcactgc 


10232 10251 SEQ ID NO:


5220gcagtagactataagcaga 


13928 


13947 


1 3 


SEQ ID NO: 


3880 


ccacagctctgtctctgag 


10305 10324 SEQ ID NO:


5221 ctcagggatctgaaggtgg 


8195 


8214 


1 3 


SEQ ID NO: 


3881 



atttgtggagggtagtcat 


10330 10349 SEQ ID NO:


5222atgaagtagaccaacaaat 


7161 


7180 


1 3 


SEQ ID NO: 


3882 


atatggaagtgtcagtggc 



10377 10398 SEQ ID NO:


5223gccacactccaacgcatat 


10778 


10797 


1 3 


SEQ ID NO: 


3883 


tggaaataccaagtcaaaa 


10453 10472 SEQ ID NO:


5224ttttacaattcattttcca 


13023 


13042 


t 3 


SEQ ID NO: 


3884 


aagtcaaaacctactgtct 


10463 10482 SEQ ID NO:


5225agacctagtgattacactt 


12859 


12878 


1 3 



SEQ ID NO: 


3885 


aQigtctcttcctccatgg 


10475 10494 SEQ ID NO:


5226ocatgcaagtcagcccagt 


10924 


10943 


1 3 



SEQ ID NO: 


3886 


cttcctGcafggaatnaa 


10482 10501 SEQ ID NO:


5227ttaatcgagaggtatgaag 


7148 


7167


1 3 



SEQ ID NO: 


3887 


ancitcaatgctgtactc 


10512 10531 SEQ ID NO:


5228gagttgagggtccgggaat 


12242 


12261


1 3


SEQ ID NO: 


3888 


ttg accacaagcttagctt 


10548 10567 SEQ ID NO:


5231 aagcgcacctcaatatcaa 


12036 


12055


1 3 


SEQ ID NO: 


3889 


cctca cctctiacttttcc 


10573 10592 SEQ ID NO:


5232 ggaactattgctagtgagg 


10649 


10668


1 3 


SEQ ID NO: 


3890 


agctgcagggcacttocaa 



10710 10720 SEQ ID NO:


5233 Hgggaagaagaggcagct 


12289 


12308 1 


1 3 


SEQ ID NO: 


3891 


nccaaaattgatgatatc 


10723 10742SEQ ID NO: 


5234gatatacactagggaggaa 


12745 


12764 1 


3 


SEQ ID NO:


3892 


gagaacatacaagcaaagc 


10860 10879SEQ ID NO: 


5235 gcttggttttgccagtctc 


2467 


2486 1 


3 


SEQ ID NO: 


3893 


aiggcaaaigtcagctctt 


10897 10916 SEQ ID NO:


5236 aagaggtatttaaagccat 


12960 


12979 1 


3 


SEQ ID NO: 


3894 


tggcaaatgtcagctcttg 


10898 10917 SEQ ID NO:


5237caagaggtatttaaagcca 


12959 


12978 1 


3 


SEQ ID NO: 


3895 


itgttcaggtccatgcaag 


10914 10933SEQ ID NO: 


5238cttgggggaggaggaacaa 


14066 


14085 1 


3 


SEQ ID NO: 


3896 


tgttcaggtccatgcaagt 


10915 10934SEQ ID NO: 


5239 acttgggggaggaggaaca 


14065 


14084 1 


3 


SEQ ID NO: 


3897 


agttccnccatgatttcc 


10940 10959 SEQ ID NO: 


5240ggaatctgatgaggaaact 


12256 


12276 1 


3 


SEQ ID NO:


3898 


tgctaacactaagaaccag 


10987 11006SEQ ID NO: 


5241 ctggatgtaaccaccagca 


11186 


11205 1 


3 


SEQ ID NO: 


3899 


actaagaaccagaagatca 


10994 11013 SEQ ID NO:


5242tgatcaagaacctg1tagt 


13347 


13366 1 


3 


SEQ ID NO: 


3900 


ctaagaaccagaagatcag 


10996 11014SEQ ID NO: 


5243 ctgatcaagaacctgttag 


13348 


13365 1 


3 


SEQ ID NO: 


3901 


cagaagatcagatggaaaa 


11003 11022SEQ ID NO: 


5244ttttcagaccaactctctg 


13622 


13841 1 


3 


SEQ ID NO: 


3902 


aaaaatgaagtccggattc 


11018 11037 SEQ ID NO:


5245gaatttgaaagttcgtttt 


9280 


9299 1 


3 


SEQ ID NO: 


3903 


gattcattctgggtctttc 


11032 11051 SEQ ID NO: 


5246gaaaacctatgccttaatc 


13166 


13185 1 


3 


SEQ ID NO: 


3904 


aagaaaaggcacaccttga 


11079 11098 SEQ ID NO: 


5247tcaaaacctactgtctclt 


10466 


10485 1 


3 


SEQ ID NO: 


3905 


aaggacaoctaaggttcct 


11116 11134 SEQ ID NO:


5248 aggacaccaaaataacctt 


7572 


7591 1 


3 


SEQ ID NO: 


3906 


ccagcattggtaggagaca 


11199 11218 SEQ ID NO:


5249 tgtcaacaagtaccactgg 


12370 


12389 1 


3 


SEQ ID NO: 


3907 


ctttgtgtacaccaaaaac


11239 11258 SEQ ID NO: 


5250gtttttaaattgftgaaag 


13148 


13167 1 


3 


SEQ ID NO: 


3908 


ccatccctgtaaaagtttt 


11277 11296 SEQ ID NO:


5251 aaaagggtcatggaaatgg 


8893 


8912 1 


3 


SEQ ID NO: 


3909 


tgatctaaattcagttctt 


11332 11351 SEQ ID NO:


5252 aagatagtcagtctgatca 


13334 


13353 1 


3 


SEQ ID NO: 


3910 


aagaagctgagaacttcat


11432 11451 SEQ ID NO: 


5253 atgagatcaacacaatctt 


13110 


13129 1 


3 


SEQ ID NO: 


3911 


tttgccctcaacctaccaa 


11453 11472 SEQ ID NO:


5254ttggtacgagttactcaaa 


12641 


12660 1 


3 


SEQ ID NO: 


3912 


cttgattcccttttttgag 


11536 11555 SEQ ID NO:


5255ctcaatittgattttcaag 


8528 


8647 1 


3 
SEQ ID NO:


3913 


ttcacgdtccaaaaagtg 


11591 11610 SEQ ID NO:


5256cactcattgattttctgaa 


12693 


12712 


1 


3 



SEQ ID NO: 


3914 


tgtttcagatggcattgct 


11608 11627 SEQ ID NO:


5257agcagattatflttgaaaca 


11833 


11852 


i 


3 



SEQ ID NO: 


3915 


aatgcagtagccaacaaga 


11639 11658 SEQ ID NO:


5258tcttttcagcccagccatt 


9231 


9250 


1 


3 



SEQ ID NO: 


3916 


ctgagcagaccattgagat 


11691 11710 SEQ ID NO:


5259atctaa1aaaaaaactcaa 


12259 


1227fl 




3 



SEQ ID NO: 


3917 


tgagcagaccattgagatt 


11692 11711 SEQ ID NO:


5260aatctaatQaaaaaactca 


12258 


1 9977 


1 




SEQ ID NO: 


3918 


ttgagattccctccattaa 


11703 11722 SEQ ID NO:


5261 ttaatcttcataaattcaa 


13170 


i3igs 


i 
1 


3 



SEQ ID NO: 


3919 


acttggagtgccagtttga 


11807 11826 SEQ ID NO:


5262tcaattaaoiaaaaacaaat 


6504 


6523 


1 
1 


w 


SEQ ID NO: 


3920 


caaatttgaaggacttcag 


12004 12023 SEQ ID NO:


5263 ct cn an aadtcatcafttn 






•t 
1 


o 
o 


SEQ ID NO: 


3921 


agcccagcgttcaccgafc 


12056 12076 SEQ ID NO:


5264 aatccaaatataattaact 


132BR 




1 


o 

w 


SEQ ID NO: 


3922 


cagcgttcaccgatctcca 


12060 12079 SEQ ID NO:


5265taaacctcicaccaaaacta 


1 OS7vJ\/ 




1 


o 


SEQ ID NO: 


3923 


ctccatctgcgctaccaga 


12074 12093 SEQ ID NO:


5266tctaatatacatcaGaaaa 


13711 


1 Of WW 


i 
1 


w 


SEQ ID NO: 


3924 


atgaggaaactcagatcaa 


12264 12283 SEQ ID NO:


5267ttaaattacGcaccatcat 


11667 


11ARR 



1 


4 
O 


SEQ ID NO: 


3925 


aggcagcttctggcttgct 


12300 12319 SEQ ID NO:


5268 aacaaatctttcctaacct 




OUO/ 


1 


o 


SEQ ID NO: 


3926 


tgaaagacaacgtgcccaa 


12327 12346 SEQ ID NO:


• 5269ttaaaaaaciacaaciittca 






i 
1 


o 
o 


SEQ ID NO: 


3927 


tatgattatgtcaacaagt 


12362 12381 SEQ ID NO:


5270 actttacactatattcata 


12763 


197ft9 



1 


o 


SEQ ID NO: 


3928 


cattaggcaaattgatgat 


12475 12494 SEQ ID NO:


5271 atcaacacaatcttcaata 


13115 



1 w 1 w*r 


1 


w 


SEQ ID NO: 


3929 


ttgactcaggaaggocaag 


12584 12603 SEQ ID NO:


5272 cttQataca aattactcaa 


12640 


12668 



1 


o 
o 


SEQ ID NO: 


3930 


gaaacctgggatatacact 


12736 12755 SEQ ID NO:


5273agtqattacacttcctttc 


12865 


12RR4 






SEQ ID NO: 


3931 


tcctttcgagttaaggaaa 


12877 12896 SEQ ID NO:


5274tttctQccactQctcaaaa 


13524 



13543 






SEQ ID NO: 


3932 


gccattcagtctctcaaga 


12974 12993 SEQ ID NO:


5275 tcttccgttctgtaataac 


5802 


5821 


1 
1 


3 



SEQ ID NO: 


3933 


gtgctacgtaatcttcagg 


13001 13020 SEQ ID NO:


5276cctgcaccaaaactaacac 


13964 


13983 



i 


3 



SEQ ID NO: 


3934 


agctgaaagagatgaaatt 


13065 13084 SEQ ID NO:


5277aatttattcaaaacaaact 


13200 


13219 


1 


3 


SEQ ID NO: 


3935 


aattlacttatcttattaa 


13080 13099 SEQ ID NO:


5278 ttaaaagaaatcttcaatt 


13816 


13834 



1 

1 


3 


SEQ ID NO: 


3936 


ttttaaattgttgaaagaa 


13150 13169 SEQ ID NO:


5279 ttctctctatgggaaaaaa


9566 



9585 



i 


3 



SEQ ID NO: 


3937 


taatcttcataagttcaat


13180 13199 SEQ ID NO:


5280 attgaaattccctccatta 


11702 



11721 



i 
1 


o 


SEQ ID NO: 


3938 


atattttgatccaagtata 


13279 13298 SEQ ID NO:


5281 tataagcaaaaocacatat 


13937 



13956 


1 


3 


SEQ ID NO: 


3939 


tgaaatattatgaacttga 


13311 13330SEQ ID NO: 


5282tcaaccttaatgattttca 


8295 


8314 


1 


3 


SEQ ID NO: 


3940 


caatttctgcacagaaata 


13442 13461 SEQ ID NO: 


5283tattcttcttttccaattg 


13834 


13853 


1 


3 


SEQ ID NO: 


3941 


agaagattgcagagctttc 


13509 13528SEQ ID NO: 


5284 gaaatcttcaatttattct 


13821 


13840 


1 


3 


SEQ ID NO: 


3942 


gaagaaaataatttctgat 


13570 13589SEQ ID NO: 


5285atcagttcagataaacttc 


7999 


8018 


1 


3 


SEQ ID NO: 


3943 


ttgacctgtccattcaaaa 


13680 13699SEQ ID NO: 


5286ttttgagaatgaatttcaa 


10422 


10441 


1 


3 


SEQ ID NO: 


3944 


tcaaaactaccacacattt 


13693 13712SEQ ID NO: 


5287aaattccttgacatgttga 


7370 


7389 


1 


3 


SEQ ID NO: 


3945 


ttttttaaaagaaatcttc


13611 13830SEQ ID NO: 


5288 gaagtgtcagtggcaaaaa 


10382 


10401 


1 


3 


SEQ ID NO: 


3946 


aggatctgagttattttgc 


14043 14062SEQ ID NO: 


5289gcaagggttcactgttcd 


7864 


7883 


1 


3 


SEQ ID NO: 


3947 


tttgctaaacttgggggag 


14067 14076SEQ ID NO: 


5290ctGCccaggacctttcaaa 


9842 


9861 


1 


3 
Table 10. Selected palindromic sequences from human glucose-6-phosphatase





Source 


Start End 
Index Index


Match 


Start End 
Index Index


# 


B 


SEQ ID NO: 


5291 tccatcttcaggaagctgt 


222 


241 SEQ ID NO: 


5369acagactctttcagatgga 


1340 


1359 


1 


6 


SEQ ID NO: 


5292 ccatcttcaggaagctgtg 


223 


242SEQ ID NO: 


5370cacagactctttcagatgg 


1339 


1358 


1 


6 


SEQ ID NO: 


5293 cctctggccatgccatggg 


417 


436SEQ ID NO: 


5371 cccattttgaggccagagg 


1492 



1511 


1 


6 


SEQ ID NO: 


5294 ctctggccatgccatgggc 


418 


437SEQ ID NO: 


5372gcccattttgaggccagag 



1491 



1510 


1 


6 


SEQ ID NO: 


5295ttgaatgtcattttgtggt 


521 


540SEQ ID NO: 


5373accatacattatcattcaa 


2945 


2964 


1 


6 


SEQ ID NO: 


5296tcagtaatgggggaccagc 


1886 


1905SEQ ID NO: 


5374gctggtctogaactcctga 


2731 


2750 


1 


6 


SEQ ID NO: 


5297 ttttactgtgcatacatgt 


1968 


1975SEQ ID NO: 


5375acatctttgaaaagaaaaa 


2983 


3002 


1 


6 


SEQ ID NO: 


5298tgaggtgccaaggaaatga 


50 


69SEQ ID NO: 


5376tcatgtctcagcctcctca 


2620 


2639 


1 


5 


SEQ ID NO: 


5299gaggtgccaaggaaatgag 


51 


70SEQ ID NO: 


5377ctcatgtctcagcctcctc 



2619 


2638 


1 


5 


SEQ ID NO: 


5300gggaaagataaagccgacc 


487 


506SEQ ID NO: 


5378 ggtcgcctggcttattocc 


1295 


1314 


1 


5 


SEQ ID NO: 


5301 Itttcctcatcaagttgtt 


598 


617SEQ ID NO: 


5379aacatctttgaaaagaaaa 


2982 


3001 


1 


5 


SEQ ID NO: 


5302 ctttcagccacatccacag 


651 


670SEQ ID NO: 


5380ctgtggactctggagaaag 


773 


792 


1 


5 


SEQ ID NO: 


5303 tggactctggagaaagccc 


776 


795SEQ ID NO: 


5381 gggctggctctcaactcca 


884 


903 


1 


5 


SEQ ID NO: 


5304agcctcctcaagaacctgg 


848 


867SEQ ID NO: 


5382ccagattcttccactggct 


2107 


2126 


1 


5 


SEQ ID NO: 


5305 ggcctggggctggctctca


878 


897 SEQ ID NO:


5383tgagccaccocaccgggcc 


2801 


2820 


1 


5 


SEQ ID NO: 


5306gagctcactcccactggaa 


1439 


1458SEQ ID NO: 


5384ttccaggtagggccagctc 


1676 


1695 


1 


5 


SEQ ID NO: 


5307agctaatgaagctattgag 


1572 


1591 SEQ ID NO: 


5385ctcagcctcctcagtagct 


2626 


2645 


1 


5 


SEQ ID NO: 


5308 gctaatgaagctattgaga 


1573 


1592 SEQ ID NO:


5386 tctcagcctcctcagtagc 


2625 


2644 


1 


5 


SEQ ID NO: 


5309 ctaaatggctttaattata 


1854 


1873 SEQ ID NO: 


5387tatatttttagaattttag 


2683 


2702 


1 


5 


SEQ ID NO: 


5310ctgcttttctttttttttc 


2509 


2528 SEQ ID NO: 


5388 gaaaaatatatatgtgcag 


2996 


3016 


1 


5 


SEQ ID NO: 


5311 caatcaccaccaagcctgg


0 


19SEQ ID NO: 


5389 ccagaatgggtccacattg


812 


831 


1 


4 


SEQ ID NO: 


5312 agcctggaataactgcaag


12 


31 SEQ ID NO: 


5390 cttggatttctgaatggct 


1987 


2006 


1 


4 


SEQ ID NO: 


5313 gttccatcttcaggaagct


220 


239 SEQ ID NO:


5391 agctcactcccactggaac 


. 1440 


1459 


1 


4 


SEQ ID NO: 


531 4tggtgggttttggatactg 


326 


345 SEQ ID NO 


5392 cagtcctccca ccctacca 


2425 


2444 


1 


4 


SEQ ID NO: 


531 5acctgtgagactggaccag 


392 


411 SEQ ID NO: 


5393ctggagaaagcccagaggt 


782 


801 


1 


4 


SEQ ID NO: 


531 egctgttacagaaactttca 


638 


657 SEQ ID NO: 


5394tgaatggtcttctgccagc 


1474 


1493 


1 


4 


SEQ ID NO 


531 7acagcatctataatgccag 


666 


685SEQ ID NO 


5395 ctgg g tgtag acctcctgt 


758 


777 


1 


4 


SEQ ID NO 


531 Sgggtgtagacctcctgtgg 


760 


779 SEQ ID NO 


5396 ccacattgacaccacaccc 


823 


842 


1 


4 


SEQ ID NO 


531 9 ggtgtagacctcctgtgga 


761 


780 SEQ ID NO 


5397 tccacattgacaccacacc 


822 


841 


1 


4 


SEQ ID NO 


5320gtgtagacctcctgtggac 


762 


781 SEQ ID NO 


5398 gtccacattgacaccacac 


821 


840 


1 


4 


SEQ ID NO 


5321 gacctcctgtggactctgg 


767 


788 SEQ ID NO 


5399 ccagatattgcactaggtc 


nr\A A 

2014 


2033 


1 


4 


SEQ ID NO 


5322 cctgggcacgotctttggc 


862 


881 SEQ ID NO 


5400 gccagctcacaagcccagg 


1687 


1706 


1 


4 


SEQ ID NO 


5323 ctgggcacgclcttlggcc 


863 


882 SEQ ID NO 


540 1 ggccagctcacaagcccag 


1686 


1705 


1 


4 


SEQ ID NO 


5324ctggtcttctdcgtcttgt 


1028 


1047 SEQ ID NO 


5402acaaaagcaagacttccag 


1663 


1682 


1 


4 


SEQ ID NO 


: 5325agagtgcggtagtgcccct 


1056 


1075SEQ ID NO 


5403 agggccaggattcctctct 


2229 


2248 


1 


4 


SEQ ID NO 


: 5326tgggcactggtatttggag 


1217 


1236SEQIDNO 


5404ctcccactggaacagccca 


1446 


1465 


1 


4 


SEQ ID NO 


5327 gaattaaatcacggatggc 


1267 


1286SEQ ID NO 


: 5405 gccaaccaagagcacattc 


2311 


2330 


1 


4 


SEQ ID NO 


5328tgttgctagaagttgggtt 


1598 


161 7 SEQ ID NO 


: 5406 aaccatcctgctcataaca 


2967 


2986 


1 


4 


SEQ ID NO 


5329 aggagctctgaatctgata 


1764 


1783SEQ ID NO 


5407tatcacattacatcatcct 


2063 


2082 


1 


4 


SEQ ID NO 


: 5330taaatggctttaattatat 


1855 


1874SEQ ID NO 


5408atatatgtgcagtatttta 


3003 


3022 


1 


4 


SEQ ID NO 


: 5331 aaaatgacaaggggagggc 


2215 


2234 SEQ ID NO 


: 5409 gccctccttgcctgtlttt 


2817 


2836 


1 


4 


SEQ ID NO 


5332 ttaaaggaaaagtcaacat 


2330 


2349 SEQ ID NO 


: 541 0 atgtgcagtattttattaa 


3007 


3026 


1 


4 


SEQ ID NO 


5333acatottctctcttttttt 


2345 


2364SEQ ID NO 


541 1 aaaagaaaaatatatatgt 


2992 


3011 


1 


4 


SEQ ID NO 


: 5334ttctacgtcctcttcccca 


197 


216SEQ ID NO 


5412tgggccagccgcacaagaa 


1116 


1135 


1 


3 


SEQ ID NO 


5335 tgggtagctgtgattggag 


257 


276 SEQ ID NO 


: 54 1 3 ctcccactggaacagccca 


1446 


1465 


1 


3 
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SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 

SEQ ID NO 
SEQ ID NO 

SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 

SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 



5336 gctgtgattggagactggc 263 

5337 cadtccgtgcccctgata 356 

5338 acatctactctttcx^lct 464 

5339 ctactctttccatctttca 468 

5340 agataaagccgacctacag 492 

5341 tgtgcagctgaatgtctgt 563 

5342 atgtctgtctgtcacgaat 564 

5343clgtcacgaatctaccttg 572 

5344 atcaagttgttgctggagt 606 

5345cagaaactttcagccacat 645 

5346actttcagccacatccaca 650 

5347 atgccagcctcaaga aata 678 

5348agaaatattttctcattac 690 

5349gaaatattttctcattacc 691 

5350tgctgctcaagggactggg 744 

5351 cctgtggactctggagaaa 772 

53529g899939^^^9®9g*9g 784 

5353 ttgaaacccccatcccaag 1 004 

5354cagatggaggtgccatatc 1 351 

5355ggagctcactcccactgga 1 438 

5356ttgggtaatgtttttgaaa 1 653 

5357gaagUgggttgttctgga 1606 

5358aaaagaaggctgcctaagg 1 785 

5359 aaagaaggctgcctaagga 1 786 

5360 aagaaggctgcctaaggag 1 787 

5361 agaaggctgcctaaggagg 1 788 

6362atttccttggatttctgaa 1 982 

5363tccttataagcccagctct 2081 

5364 aiaagcccagctctgcttt 2086 

5365 ggccaggattcctctclca 2231 

5366gccaactcctccttgcctg 2493 

5367ttttttttctttttttgag 251 9 

5368 ccggcgtgcaccaccatgc 2652 



282SEQ ID NO 
377 SEQ ID NO 
483 SEQ ID NO 
487SEQ ID NO 
511 SEQ ID NO 
572SEQ ID NO 
583SEQ ID NO 
591 SEQ ID NO 
625SEQ ID NO 
664SEQ ID NO 
669SEQ ID NO 
697SEQ ID NO 
709SEQ ID NO 
710SEQ ID NO 

763sEQ ID NO 
791 SEQ ID NO 

803SEQ ID NO 
1023 SEQ ID NO 
1370 SEQ ID NO 
1457 SEQ ID NO 
1572 SEQ ID NO 
1625SEQ ID NO 
1804 SEQ ID NO 
1805SEQ ID NO 
1806 SEQ ID NO 
1807 SEQ ID NO 
2001 SEQ ID NO 
2100SEQ ID NO 
2105SEQ ID NO 
2250 SEQ ID NO 
2512SEQ ID NO 
2538 SEQ ID NO 
2671 SEQ ID NO 



5414occatgccatgggcacagc 423 442 1 3 

5415tatcacccaggGtggagtg 2548 2567 1 3 

5416agatgggatttcatcatgt 2705 2724 1 3 

5417tgaatactctcacaagtag 1419 1438 1 3 

5418ctgtttttcaatctcatct 2828 2847 1 3 

5419acagaaactttcagccaca 644 663 1 3 

5420attcaggtatagctgacat 2038 2057 1 3 

5421 caaggtgctaggattacag 2779 2798 1 3 

5422actcctgacctcaagtgat 2742 2761 1 3 

5423atgtttcaattaggctctg 2185 2204 1 3 

5424tgtggcgtatcatgcaagt 1818 1837 1 3 

5425tattttttttactgtgcat 1950 1969 1 3 

5426gtaaatatgactcctttct 2283 2302 1 3 

5427ggtaaatatgactcctttc 2282 2301 1 3 

5428cccaagccaaccaagagca 2306 2325 1 3 

5429tttcatcatgttggccagg 2713 2732 1 3 

5430ccaccgcaccgggccctcc 2805 2824 1 3 

5431cttgaattcctgggctcaa . 2405 2424 1 3 

5432gatatgcagagtatttctg 2847 2866 1 3 

5433tccacctgccttggcctcc • 2760 2779 1 3 

5434tttctctatcccaagccaa 2297 2316 1 3 

5435tccaccccactggatcttc . 2131 2150 1 3 

5436ccttgcctgcttttctttt 2503 2522 1 3 

5437tccttgcctgcttttcttt • 2502 2521 1 3 

5438ctccttgcctgcttttctt 2501 2520 1 3 

5439octccttgcctgcttttct 2500 2519 1 3 

5440ttcaattaggctctgaaat 2189 2208 1 3 

5441agagcacattcttaaagga 2319 2338 1 3 

5442aaagctgaagcctatttat 2889 2908 1 3 

5443tgagccaccgcaccgggcc 2801 2620 1 3 

5444caggctggagtggagtggc 2555 2574 1 3 

5445ctcataacatcttlgaaaa 2977 2996 1 3 

5446gcatgagccaccgcaccgg 2798 2817 1 3 
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Table 11. Selected palindromic sequences from rat glucose-6-phosphatase 





Source  Start Index  End Index  Match  Start Index  End Index  #  B

SEQ ID NO: 5447  ctgactattacagcaacag  301  320  SEQ ID NO: 5471  ctgtggctgaaactttcag  [illegible]  [illegible]  1  6
SEQ ID NO: 5448  ctcttggggttggggctgg  831  850  SEQ ID NO: 5472  ccagcatgtaccgcaagag  [illegible]  [illegible]  1  6
SEQ ID NO: 5449  tgcaaaggagaactgcgca  879  898  SEQ ID NO: 5473  tgcgaccgtcccctttgca  [illegible]  [illegible]  1  6
SEQ ID NO: 5450  cctcgggccatgccatggg  376  395  SEQ ID NO: 5474  cccagtgtggggccagagg  [illegible]  [illegible]  1  5
SEQ ID NO: 5451  ttgagcaaaccatatgcaa  1478  1497  SEQ ID NO: 5475  ttgcagagtgtgtcttcaa  [illegible]  [illegible]  1  5
SEQ ID NO: 5452  cagcttcctgaggtaccaa  2  21  SEQ ID NO: 5476  ttggtgtctgtgatcgctg  123  142  1  4
SEQ ID NO: 5453  ggtaccaaggaggaaggat  13  32  SEQ ID NO: 5477  atccagtcgactcgctacc  66  85  1  4
SEQ ID NO: 5454  ctccacgactttgggatcc  51  70  SEQ ID NO: 5478  ggatcgggaggagggggag  [illegible]  [illegible]  1  4
SEQ ID NO: 5455  caggactggtttgtcttgg  108  127  SEQ ID NO: 5479  ccaagcccgactgtgcctg  2018  2037  1  4
SEQ ID NO: 5456  cttctatgtcctctttccc  155  174  SEQ ID NO: 5480  gggacagacacacaagaag  1076  1095  1  4
SEQ ID NO: 5457  ttctatgtcctctttccca  156  175  SEQ ID NO: 5481  tgggacagacacacaagaa  1075  1094  1  4
SEQ ID NO: 5458  tggttccacattcaagaga  177  196  SEQ ID NO: 5482  tctcaataatgatagacca  1549  1568  1  4
SEQ ID NO: 5459  tgcctctgataaaacagtt  325  344  SEQ ID NO: 5483  aactctgagatcttgggca  1868  1887  1  4
SEQ ID NO: 5460  agcccggctcctgggacag  1064  1083  SEQ ID NO: 5484  ctgtcctccagcctgggct  2034  2053  1  4
SEQ ID NO: 5461  agtctctgacacaagtcag  1111  1130  SEQ ID NO: 5485  ctgaatggtaatggtgact  1659  1678  1  4
SEQ ID NO: 5462  aaaaaggtgaatttttaaa  1237  1256  SEQ ID NO: 5486  tttattaaaacgacatttt  2201  2220  1  4
SEQ ID NO: 5463  acactctcaataatgatag  1545  1564  SEQ ID NO: 5487  ctatgaatgatgcctgtgt  2121  2140  1  4
SEQ ID NO: 5464  aaagaatgaacgtgctcca  37  56  SEQ ID NO: 5488  tggacctcctgtggacttt  724  743  1  3
SEQ ID NO: 5465  ctttgggatccagtcgact  59  78  SEQ ID NO: 5489  agtcagcggccgtgcaaag  1124  1143  1  3
SEQ ID NO: 5466  gtgatcgctgacctcagga  132  151  SEQ ID NO: 5490  tcctctctccaaaggtcac  1911  1930  1  3
SEQ ID NO: 5467  ggaacgccttctatgtcct  148  167  SEQ ID NO: 5491  aggactcatcactgcttcc  1748  1767  1  3
SEQ ID NO: 5468  gactgtgggcatcaatctc  194  213  SEQ ID NO: 5492  gagactggaccagggagtc  357  376  1  3
SEQ ID NO: 5469  ggacactgactattacagc  296  315  SEQ ID NO: 5493  gctgaacgtctgtctgtcc  518  537  1  3
SEQ ID NO: 5470  aagcccccgtcccagattg  966  985  SEQ ID NO: 5494  caattgtttgctggtgctt  1833  1852  1  3
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Table 12. Selected palindromic sequences from human B-catenin 

Source  Start Index  End Index  Match  Start Index  End Index  #  B

SEQ ID NO: 5495  agcagcttcagtccccgcc  70  89  SEQ ID NO: 5542  ggcgacatatgcagctgct  2152  2171  1  5
SEQ ID NO: 5496  ccattctggtgccactacc  304  323  SEQ ID NO: 5543  ggtatggaccccatgatgg  2387  2406  1  5
SEQ ID NO: 5497  tccttctctgagtggtaaa  328  347  SEQ ID NO: 5544  tttattacatcaagaagga  985  1004  1  5
SEQ ID NO: 5498  tctgagtggtaaaggcaat  334  353  SEQ ID NO: 5545  attgtacgtaccatgcaga  791  810  1  5
SEQ ID NO: 5499  cagagggtacgagctgcta  473  492  SEQ ID NO: 5546  tagctgcaggggtcctctg  2037  2056  1  5
SEQ ID NO: 5500  ctaaatgacgaggaccagg  677  696  SEQ ID NO: 5547  cctgtaaatcatcctttag  2539  2558  1  5
SEQ ID NO: 5501  taaatgacgaggaccaggt  678  697  SEQ ID NO: 5548  acctgtaaatcatccttta  2538  2557  1  5
SEQ ID NO: 5502  gtcctgtatgagtgggaac  383  402  SEQ ID NO: 5549  gttccgaatgtctgaggac  2176  2195  2  4
SEQ ID NO: 5503  cccagcgccgtacgtccat  1839  1858  SEQ ID NO: 5550  atgggctgccagatctggg  2451  2470  2  4
SEQ ID NO: 5504  tcccctgagggtatttgaa  143  162  SEQ ID NO: 5551  ttcacatcctagctcggga  1929  1948  1  4
SEQ ID NO: 5505  gggtatttgaagtatacca  151  170  SEQ ID NO: 5552  tggttaagctcttacaccc  1680  1699  1  4
SEQ ID NO: 5506  gctgttagtcactggcagc  260  279  SEQ ID NO: 5553  gctgcctccaggtgacagc  2494  2513  1  4
SEQ ID NO: 5507  gtcctgtatgagtgggaac  383  402  SEQ ID NO: 5554  gttcgccttcactatggac  1652  1671  1  4
SEQ ID NO: 5508  tcctgtatgagtgggaaca  384  403  SEQ ID NO: 5555  tgttccgaatgtctgagga  2175  2194  1  4
SEQ ID NO: 5509  gtatgcaatgactcgagct  454  473  SEQ ID NO: 5556  agctggcctggtttgatac  2517  2536  1  4
SEQ ID NO: 5510  gtccagcgtttggctgaac  563  582  SEQ ID NO: 5557  gttcgccttcactatggac  1652  1671  1  4
SEQ ID NO: 5511  tatcaagatgatgcagaac  623  642  SEQ ID NO: 5558  gttcgtgcacatcaggata  1820  1839  1  4
SEQ ID NO: 5512  tatggtccatcagctttct  718  737  SEQ ID NO: 5559  agaaagcaagctcatcata  1126  1145  1  4
SEQ ID NO: 5513  ccctggtgaaaatgcttgg  915  934  SEQ ID NO: 5560  ccaaagagtagctgcaggg  2029  2048  1  4
SEQ ID NO: 5514  agctttaggacttcacctg  1291  1310  SEQ ID NO: 5561  caggtgacagcaatcagct  2502  2521  1  4
SEQ ID NO: 5515  ggaatctttcagatgctgc  1356  1375  SEQ ID NO: 5562  gcagctgctgttttgttcc  2162  2181  1  4
SEQ ID NO: 5516  tgtccttcgggctggtgac  1549  1568  SEQ ID NO: 5563  gtcatctgaccagccgaca  1605  1624  1  4
SEQ ID NO: 5517  cacagctcctctgacagag  2107  2126  SEQ ID NO: 5564  ctctaggaatgaaggtgtg  2134  2153  1  4
SEQ ID NO: 5518  ccagacagaaaagcggctg  245  264  SEQ ID NO: 5565  cagctcgttgtaccgctgg  828  847  2  3
SEQ ID NO: 5519  cagcagcgttggcccggcc  4  23  SEQ ID NO: 5566  ggccaccaccctggtgctg  2420  2439  1  3
SEQ ID NO: 5520  aggtctgaggagcagcttc  60  79  SEQ ID NO: 5567  gaagaggatgtggatacct  359  378  1  3
SEQ ID NO: 5521  actgttttgaaaatccagc  174  193  SEQ ID NO: 5568  gctgatattgatggacagt  437  456  1  3
SEQ ID NO: 5522  ctgatttgatggagttgga  213  232  SEQ ID NO: 5569  tccaggtgacagcaatcag  2500  2519  1  3
SEQ ID NO: 5523  ccagacagaaaagcggctg  245  264  SEQ ID NO: 5570  cagcaacagtcttacctgg  275  294  1  3
SEQ ID NO: 5524  acagctccttctctgagtg  323  342  SEQ ID NO: 5571  cactgagcctgccatctgt  1579  1598  1  3
SEQ ID NO: 5525  tggatacctcccaagtcct  369  388  SEQ ID NO: 5572  aggactaaataccattcca  1972  1991  1  3
SEQ ID NO: 5526  tcaagaacaagtagctgat  424  443  SEQ ID NO: 5573  atcagctggcctggtttga  2514  2533  1  3
SEQ ID NO: 5527  agctcagagggtacgagct  469  488  SEQ ID NO: 5574  agctggtggaatgcaagct  1276  1295  1  3
SEQ ID NO: 5528  gcatgcagatcccatctac  516  535  SEQ ID NO: 5575  gtagaagctggtggaatgc  1271  1290  1  3
SEQ ID NO: 5529  ccacacgtgcaatccctga  645  664  SEQ ID NO: 5576  tcagatgatataaatgtgg  1430  1449  1  3
SEQ ID NO: 5530  cacacgtgcaatccctgaa  646  665  SEQ ID NO: 5577  ttcagatgatataaatgtg  1429  1448  1  3
SEQ ID NO: 5531  ggaccttgcataacctttc  846  865  SEQ ID NO: 5578  gaaatcttgccctttgtcc  1743  1762  1  3
SEQ ID NO: 5532  ctccacaaccttttattac  974  993  SEQ ID NO: 5579  gtaaatcatcctttaggag  2542  2561  1  3
SEQ ID NO: 5533  cagagtgctgaaggtgcta  1222  1241  SEQ ID NO: 5580  tagctgcaggggtcctctg  2037  2056  1  3
SEQ ID NO: 5534  ggactctcaggaatctttc  1347  1366  SEQ ID NO: 5581  gaaatcttgccctttgtcc  1743  1762  1  3
SEQ ID NO: 5535  tgatataaatgtggtcacc  1435  1454  SEQ ID NO: 5582  ggtgacagggaagacatca  1562  1581  1  3
SEQ ID NO: 5536  cccagcgccgtacgtccat  1839  1858  SEQ ID NO: 5583  atggccaggatgccttggg  2370  2389  1  3
SEQ ID NO: 5537  gtccatgggtgggacacag  1852  1871  SEQ ID NO: 5584  ctgtgaacttgctcaggac  2053  2072  1  3
SEQ ID NO: 5538  ttgtaccggagcccttcac  1915  1934  SEQ ID NO: 5585  gtgaacttgctcaggacaa  2055  2074  1  3
SEQ ID NO: 5539  ttgttatcagaggactaaa  1962  1981  SEQ ID NO: 5586  tttaggagtaacaatacaa  2553  2572  1  3
SEQ ID NO: 5540  gaagctattgaagctgagg  2084  2103  SEQ ID NO: 5587  cctctgacagagttacttc  2114  2133
SEQ ID NO: 5541  tcagaacagagccaatggc  2247  2266  SEQ ID NO: 5588  gccaccaccctggtgctga  2421  2440

Table 13. Selected palindromic sequences from human hepatitis C virus (HCV)



SEQ ID NO: 5589
SEQ ID NO: 5590
SEQ ID NO: 5591
SEQ ID NO: 5592
SEQ ID NO: 5593
SEQ ID NO: 5594
SEQ ID NO: 5595
SEQ ID NO: 5596
SEQ ID NO: 5597
SEQ ID NO: 5598
SEQ ID NO: 5599
SEQ ID NO: 5600
SEQ ID NO: 5601
SEQ ID NO: 5602
SEQ ID NO: 5603
SEQ ID NO: 5604
SEQ ID NO: 5605
SEQ ID NO: 5606
SEQ ID NO: 5607
SEQ ID NO: 5608
SEQ ID NO: 5609
SEQ ID NO: 5610
SEQ ID NO: 5611
SEQ ID NO: 5612
SEQ ID NO: 5613
SEQ ID NO: 5614
SEQ ID NO: 5615
SEQ ID NO: 5616
SEQ ID NO: 5617
SEQ ID NO: 5618
SEQ ID NO: 5619
SEQ ID NO: 5620
SEQ ID NO: 5621
SEQ ID NO: 5622
SEQ ID NO: 5623
SEQ ID NO: 5624
SEQ ID NO: 5625
SEQ ID NO: 5626
SEQ ID NO: 5627
SEQ ID NO: 5628
SEQ ID NO: 5629
SEQ ID NO: 5630


Source  Start Index  End Index  Match  Start Index  End Index  #  B




5314 


5333 


SEQ ID NO: 
SEQ ID NO: 

SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO 
SEQ ID NO 

SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 

SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 

SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 


6135 


taccatcacccagctgctg 


6196 


6215 




9 


aoCiCyiucyyaiyv/UV/yy 


1682 


1701 


6136 


ccgggcagcgggtcgagtt 


8202 


8221 


-1 


8 


cdctactoaataQCQCtCd 


1049 


1068 


6137 


tgagagcgacgccgcagcg 


6151 


6170 


1 


7 


ciCC99°^^^^^^9^^cg 






6138 


cggcatgtgggcccgggag 


6053 


6072 




7 


[giaacaicgg99999^c;g 


704.8 


2067 


6139 


cgacccctcccacattaca 


6871 


6890 




7 


9 laacaicgy yy y y y iwy y 


2046 


2068 


6140 


ccgacccctcccacattac 


6870 


6889 




7 


□agcGoGCoaytrciyyuyya 


'^556 


5575 


6141 


tccggctggttcgttgctg 


9254 


9273 




7 


SLuaCCaCUUaycuatrciiAA/ 


5744 


5763 


6142 


gggtgtgcacggtgttgag 


6291 


6310 


'J 


7 


ccaccctfaccatcaccca 


6189 


6208 


6143 


tgggcgctggtatcgctgg 


5832 


5851 


1 


7 


ctaca ccata ttccQQCtc 


6249 


6268 


6144 


gagcccgaaccggacgtag 


6830 


6849 


1 


7 


hsirn rmtnttecaactca 

ici\/y vy y V wy 


625C 


6269 


6145 


pgagcccgaaccggacgta 


6829 


6848 


-} 


, 7 


aaattcctaataaaaacct 


8216 


8235 


6146 


aggctatgactaggtactc ' 


6634 


8653 


1 


. 7 


atnocQaaaaactaoQcta 


1430 


1449 


6147 


fagcgcattttcactccat 


9019 


9038 




6 


aaccaaacotaacaccaa c 


370 


389 


6148|gttgccgctaccttaggtt 


4115 


4134 


1 


6 


□atcatcaaatcattaata 

yy lyy ^vayok^/y iiyy fcy 


419 


438 


6149 


caccagcccgctcaccacc 


5734 


6753 


1 


: 6 


ccngycccv^iv.riaiyyL'Gi 


584 


603 


6150 


tgccaacgtgggtacaagg 


6374 


6393 




6 


laCCCCygccacycyicciy 




1284 


; 6151 


ctgacgactagctgcggta 


8465 


8484 




6 


aaacacQctQCCCQCctca 

\A M n vwl ^ *M w w w^ w w* w^*» 


1508 


1527 


: 6152 


tgagacgacgaccgtgccc 


4759 


4778 


1 


6 


ctgcaatgactccctccag 


1624 


1643 


: 6153 


ctggtggccctcaatgcag 


2594 


2613 




6 


aacjcgatcgtctcggcaac 


1897 


1916 


: 6154 


gttgccgctaccttaggtt 


4115 


4134 




6 


Stgcggggcccccccgtgt 


2032 


2051 


: 6155 


acaccacgggcccctgcac 


6537 


6556 




6 


atataaaaaacotaaaaca 

^ ly »y y y y y y ^y *y y y **** 


2238 


2257 


: 6156 


tgctcaatgtcctacacat 


7610 


7629 




6 


ggsgagcgttgcaacttgg 


2288 


2307 


: 6157 


ccaagctcaaactcactcc 


9207 


9226 




6 


cgtccgttgccggagcgca 


2613 


2632 


; 6158 


tgcgagcccgaaccggacg 


6827 


6846 


1 


6 


atctaacattattaacctt 


2817 


2836 


: 6159 


aaggtcacctttgacagac 


7763 


7782 


^ 


. 6 


tctttaatatcaccaaact 


2997 


3016 


: 6160 


agttcgatgaaatggaaga 


5454 


5473 


1 


6 


cttctgattgccatactcg 


3014 


3033 


: 6161 


cgagcaattcaagcagaag 


5518 


6537 


1 


6 


gcggcgtgtggggacatca 


3314 


3333 


: 6162 


tgatcacgccatgcgccgc 


7641 


7660 


1 


6 


gggacatcatcctgggcct 


3324 


3343 


: 6163 


aggcggtggattttgtccc 


3919 


3934 


1 


6 


gggcgtcttccgggccgct 


3874 


3893 


. 6164 


agcggcacggcgaccgccc 


7439 


7458 


1 


6 


ggcgtcttccgggccgctg 


3875 


3894 


. 6165 


cagcggcacggcgaccgcc 


7438 


7457 




6 


gcgtcttccgggccgctgt 


3876 


3895 


: 6166 


^caggtgccctgatcacgc 


7631 


7650 




6 


gtccccggtcttcacagac 


3961 


3980 


: 6167 


gtcttggaagaacccggac 


7252 


7271 




6 


catcaggactggggtaagg 


4174 


4193 


: 6168 


ccttcctcaagccgtgatg 


8155 


8174 




6 


ccgacggtggttgctccgg 


4245 


4264 


; 6169 


ccgggggaacggccctcgg 


4853 


4872 




6 


ggggggaaggcacctcatt 


4501 


4520 


: 6170 


aatgttgtgacttggcccc 


8334 


8353 




6 


ccgagcaattcaagcagaa 


5617 


5536 


: 6171 


ttctgattgccatactcgg 


3015 


3034 




6 


agatgaaggcaaaggcgtc 


7821 


7840 


: 6172 


gacgaccttgtcgttatci 


8564 


8583 




6 


cccctagggggcgctgcca 


767 


786 


: 6173 


tggccggcgccccccgggg 


3674 


3693 


3 


5 


ctcccggcctagttggggc 


64.6 


665 


: 6174 


gcccccccttgagggggag 


7519 


7538 


2 


5 


ttccgctcgtcggcggccc 


75G 


769 


. 6175 


gggcaaaggacgtccggaa 


7923 


7942 


2 


5 


Cccctagggggcgctgcca 


767 


786 


: 6176 


tggcgggggcccactgggg 


1383 


1402 


2 


5 


SEQIDNO: 5631 


Igccccgccggcatgcgaca 


1222 


1241 


; 6177 


tgtcccagggggggagggc 


9147 


9166 


2 


5 
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SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 

SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 

SEQ ID NO: 
SEQ ID NO: 

SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 

SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



5632|aggacgaccgggtcctttc 



5633 ggacgaccgggtcctttot 



5634 aaaaccaaacgtaacacca 



5635 caaccgccgcxxjacaggac 



5636 cggtggtcagatcgttggt 



5637 acctgttgccgcgcagggg 



5638 tgccgcgcaggggccccag 



5639 gggccccaggttgggtgtg 



5640 gttggggccccacggaocc 
5641 



5642 
5643 
5644 
5645 



cctcacatgcggcctcgcc 



cacatgcggcctcgccgac 



tggggccccacggacccc 



ggggccccacggaccccc 



ccgctcgtcggcggcccc 



5646 ggcgctgccagggccttgg 



5647 ccatgtcacgaacgactgc 



5648 gtgccctgcgttcgggagg 



5649 



tgccctgcgttcgggaggg 



5650 gcx^ctgcgttcgggagggt 



5651 aggaatgctaccatcccca 



5652 tccccactacgacaatacg 



5653 atacgacaccacgtcgatt 



5654 atttgctcgttggggcggc 



5655 ccttctcgccccgccggca 



5656acx:ccggccacgcgtcagg 



5657 gcxctcgtagtgtcgcagt 



5658 gccgtctcagagaatccag 



5659 ctgaactgcaatgactccc 



5660 agactgggtttcttgccgc 
5661 tcgtccggatgcccggagc 



5662 ccagggatggggtcctatc 



5663 gacaaccgatcgtctcggc 



5664 caagacgtgcggggccccc 



5665 acgtgcggggcccccocgt 



5666 cxjggaagcaccccgaggcc 



5667 aggccacgtactcaaaatg 



5668 tgtatgtggggggcgtgga 



5669 gagtggcaggttctgccct 



5670 tcctttgcaatcaaatggg 

5671 agccx:aggccgaggccgcc 



5672 ggcggcatatgctttctat 



5673 gcggcatatgctttctatg 



5674 cggcatatgctttctatgg 



5675 tgcatgtgtgggttccccc 



5676 cccxxxtcaacgtccgggg 



178 



179 198SEQIDNO 



368 



385 



418 



444 



450 



460 



657 



658 



659 



715 



718 



751 



776 



943 



1019 



1020 



102 



1085 



1098 



1112 



1128 



1215 



1266 



1331 



1558 



1619 



1641 



1685 



1738 



1894 



2026 



2030 



2101 



2115 



2235 



2354 



2474 



2566 



2698 



2699 



2700 



2913 



2928 



5677 gggcaggggtggcgactcc 



5678 atgttggactgtctaccat 
5679|tgttggactgtctaccatg 



3401 



3574 



3575 



197|SEQ ID NO 



387 SEQ ID NO 



404 SEQ ID NO 



437SEQ ID NO 



463SEQ ID NO 



469SEQ ID NO 



479 SEQ ID NO 



676 SEQ ID NO 



677 SEQ ID NO 



678 SEQ ID NO 



734 SEQ ID NO 



737SEQ ID NO 



770sEQ ID NO 



795SEQ ID NO 



962SEQ ID NO 



204 



10383£Q ID NO 



1039 SEQ ID NO 



1040 SEQ ID NO 



1104 SEQ ID NO 



1117SEQIDN0 



1131 SEQ ID NO 



1147SEQID NO 



1234SEQ ID NO 



1285SEQ ID NO 



1350 SEQ ID NO 



1577 SEQ ID NO 



1638 SEQ ID NO 



1660 SEQ ID NO 



1704 SEQ ID NO 



1757 SEQ ID NO 



1913SEQ ID NO 



2045 SEQ ID NO 
9 SEQ ID NO 

OS 



212 



21 34 SEQ ID NO 



2254 SEQ ID NO 



2373 SEQ ID NO 



2493 SEQ ID NO 



2585 SEQ ID NO 



271 7 SEQ ID NO 



SEQ ID NO 



2718SEQ ID NO 



2719SEQ ID NO 



2932 SEQ ID NO 



2947 SEQ ID NO 
3420sEQ ID NO 



3593 SEQ ID NO 



3594SEQ ID NO 



61 78 gaaaaaggacggttgtcct 



61 79 agaaaaaggacggftgtcc 



6180lggtttttttttttttttt 



6181 



6182accattgagacgacgaccg 



61 63 ccccggccacgcgtcaggt 



61 84 ctgggcgcgctgacgggca 



61 85 cacagcctgtctcgtgccc 



61 86 gggtgggtagccgcccaac 



6191 



gtcctgaacccgtctgttg 



61 87ggggtgggtagccgcccaa 



6 1 88 gggggtgggtagccgccca 



61 89 ggcggggcgacaatagagg 



61 90gtcgtcggagtcgtgtgtg 



ggggcaaaggacgtccgga 



61 92 ccaagccacagtgtgcgcc 



61 93gcagcaacacgtggcatgg 



61 94 cctcacaacgggggggcac 



61 95 ccctcacaacgggggggca 



6196accctcacaacgggggggc 



61 97 tgggcatcggcacagtcct 



6198 cgtattcccagatttggga 



61 99aatcaatgctgtagcgtat 



6201 



6200 gccgccacttgcggcaaat 



tgccaacgtgggtacaagg 



6202 cctgccgcggttaccgggt 



6203 actgcgtcggcatgtgggc 



6204 ctggtatcgctggtgcggc 



6205gggacagatcggagctcag 



6206 gcggcgagcctacgagtct 



6207 gctccgggggcgcttacga 



6208 gataacttcccctacctgg 



6209 gccgcggttaccgggtgtc 



6210 ggggtctcccccctccttg 



6211 



acgQQcgcccocattacgt 



621 2 ggccgctgtatgcacccgg 



621 3 cattatgtccaaatggcct 



6214 tccaagtggcccatctaca 



621 5 agggcaggggtggcgactc 



6216 cccaccttatgggcaagga 



6217 ggcgtccacagtcaaggct 



621 8 atagaagaagcctgccgcc 



621 Scatagaagaagcctgccgc 



6221 



62Z 



6220 ccalagaagaagcctgccg 



ggggggacggcatcatgca 



latcgatgaacgggg 



6223^ggaggGcgcaagccagccc 



6224atggtaccgaccctaacat 



6229catggtaccgaccctaaca 



7341 



7360 



7340 



7359 



9443 



9462 



4100 



4119 



4754 



4773 



1267 



1286 



3164 



3183 



9296 



9315 



5783 



5802 



5782 



5801 



5781 



5800 



3774 



3793 



6020 



6039 



7922 



7941 



5110 



5129 



6498 



6517 



1495 



1514 



1494 



1513 



1493 



1512 



4323 



4342 



8092 



8111 



4576 



4595 



9164 



9183 



6374 



6393 



6340 



6359 



6046 



6065 



5838 



5857 



2313 



2332 



860S 



8628 



4257 



4276 



5084 



5103 



6343 



6362 



6919 



6938 



4202 



4221 



3886 



3905 



3137 



3156 



4011 



4030 



3400 



3419 



8861 



8880 



7834 



7853 



788S 



7884 



7864 



7883 



7863 



7882 



6402 



6421 



9376 



9395 



8066 



808S 



4158 



4177 



4157 



4176 



5 
5 
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SEQIDNO: 5680c 






Of ih; 


SEQ ID NO: 
3EQ ID NO: 
5EQ ID NO: 
SEQ ID NO: 
BEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 

SEQ ID NO 

SEQ ID NO 

SEQ ID NO 

SEQ ID NO 

SEQ ID NO 

SEQ ID NO 

SEQ ID NO 

SEQ ID NO 

SEQ ID NO 

SEQ ID NO 

SEQ ID NO 

SEQ ID NO 
ccn in NO 

SEQ ID NO 

SEQ ID NO 
epn in NO 

QPO in NO 

cpn in NO 

SEQ ID NO 
SEQ ID NO 

SEQ ID NO 
SEQ ID NO 


6226lt 


acacaatactcgtgaacg 


8543 


8562 




5 


SEQIDNO: 5681 J 


^ ^^^^ ^^^^^^^^^ ^^^^^%f%^0^9^*t 

^caccatgcacctgtggca 


Of LW 


Or /lO; 


62271 


ac^cxiattacxxiaatcit 


6342 


6361 




5 


SEQIDNO: 5682 1 


;a(xatgcacx:tgtggcag 


OfVjO 


0794 < 
Of cf^X 


6228 


:tacx3acaattaccacicitci 


6341 


6360 




5 


SEQIDNO: 5683 ( 


jgc^atcggcacagtcctgg 






6229 ( 


:}caaaattacc^atttQcc 


4979 


4998 


-J 


5 


SEQIDNO: 5684? 


aagcggagacggctggagc 




^'%R6< 


6230 ( 


3Ctcccccx:aacactactt 


5804 


5823 




5 


SEQIDNO: 5686 ( 


sgagcgcggcttgtc^tgc 




H-OOU| 


62311 


icacaacaaccQCXX^ctcc 


7443 


7462 




6 


SEQIDNO: 5686i 


;gaagccatcaagggggga 


4489 


4508 


D^O<cl 


icccu way by oiy u uuy 


5806 

w www 


5825 




5 


SEQIDNO: 56871 
SEQIDNO: 5688 1 


ggaagtgtctcatacggc 


5165 


CA OA 

5184 


X)£.O0 


J vWy y cl I Ici ua a <■ V/ o LL'L' a 


722S 


7244 
1 fall 




b 


3ggtgctggtaggcggagt 


5322 


5341 


RO'kA 


3C/icgcy a LciA^uueiiA«v* 


8765 

W f WW 


8784 




5 


SEQ ID NO: 5689
SEQ ID NO: 5690
SEQ ID NO: 5691
SEQ ID NO: 5692
SEQ ID NO: 5693
SEQ ID NO: 5694
SEQ ID NO: 5695
SEQ ID NO: 5696
SEQ ID NO: 5697
SEQ ID NO: 5698
SEQ ID NO: 5699
SEQ ID NO: 5700

SEQ ID NO: 5701
SEQ ID NO: 5702
SEQ ID NO: 5703
SEQ ID NO: 5704
SEQ ID NO: 5705
SEQ ID NO: 5706
SEQ ID NO: 5707
SEQ ID NO: 5708
SEQ ID NO: 5709
SEQ ID NO: 5710
SEQ ID NO: 5711
SEQ ID NO: 5712
SEQ ID NO: 5713
SEQ ID NO: 5714
SEQ ID NO: 5715
SEQ ID NO: 5716
SEQ ID NO: 5717

SEQ ID NO: 5718
SEQ ID NO: 5719
SEQ ID NO: 5720
SEQ ID NO: 5721
SEQ ID NO: 5722
SEQ ID NO: 5723
SEQ ID NO: 5724
SEQ ID NO: 5725
SEQ ID NO: 5726
SEQ ID NO: 5727


3tgggtaggatc:atcttgt 


5390 


5409 


D<CO0 


acaavJaiyyicidcyt^uau 


7713 


7732 




5 


^gccgagcaattcaagcag 


5516 


5534 




^igcac^cciiGuui^yuy 




6569 


— ^ 


b 


tggagtccaagtggcgagc 


5592 


5611 




jciCATic^iac^g aiiuuci 


81 75 


8194 

\J 1 w~ 


— ^ 


5 

w 


LQgcgagcuiggagacui 




5622 


6238 


aqqtgccctgatcacgcca 


7633 


7652 




5 


jcxcgcrcacTC^ccA/agaa 




^758 


6239 


[tctQOcqqQc^tatggggc 


5895 


5914 


1 


5 


.gagtg aciic>aa g acxjig 




6325 


6240 


caaoctataaaatcgctca 


8363 


8382 




5 


aigica a aaa c;g g I iccai 




6475 


6241 


atgatacxgacxctaacat 


4158 


4177 


i 


5 


[X«gaaaalA^iyCKayOclclUcl 




6507 


6242 


tgttccrtcc^aatgtgtcgg 


8708 


8727 




5 


3 g uycCacKaUiciiii/Ueioy 


G565 


6584 


6243 


cttgaaagcctctgccgcc 


8500 


8519 




5 


gccC/ icciug dg y g<-»y atfCi 


6967 


6986 


6244 


tgtctcctacttgaagggc 


3814 


3833 


i 


5 


CaCA#cgcy ly y ay luyy ay 


7078 


7097 


: 6245 


ctcxggtggtacacgggtg 


7278 


7297 




5 


ggagggggaigdgdfciiycJci 




7157 


: 6246 


ttc;atgctgtg(x;tactcc 


9326 


9345 


1 


5 


9^g g ^^^'^^^ gu 


7202 


7221 


. 6247 


ccc^gggggggagggocgc 


9150 


9169 


1 


5 


ttgccacctgtcaaggccc 


7301 


7320 


MASK 


jy y uv/y vUdoi ly i/y y a 


9162 

W 1 w^ 


9181 


— ^ 


5 


cccxxcxttgagggggagc 


7520 


7539 




gcACx;c;yycciayiiyyyy 


645 

w^rw 


664 




5 


■ a i 111 

ctgctgctcaatgtcctac 


7606 


7625 




jiaggaciyyoayyyyuciy 


4809 


4828 




5 


catggacaggtgccctgat 


7626 


7645 


\ DZD1 


aica iiy a a cy CI (.^i(/v«a Ly 


8QQ6 


9015 

WW 1 W 




5 


atggacaggtgcxctgatc 


7627 


7646 




jaiC^aUgaauyaC/lUbaL 


Uwww 


9014 

WW I 




5 


ggctatgactaggtactcc 


8635 


8654 


\ QZOO 


n o /* o o f rt d a o rt P f* 

ygagcaaciiyaaaaoyuu 




8939 

Uwww 




5 


caccatagatcactcccct 


27 


A A 

46 




agggccuggcacaiggig 


785 

f Ow 


804 

Ww^ 


2 


4 


agciguCaiA«u\#ic«ycc 




1225 


' 6255 


qqcgtqctgacgactagct 


8459 


8478 


2 


4 


cigc^aigacicccu/wag 


1R24 


1643 


• 6256 


ctaatqcagctgttggcag 


5847 


5866 


2 


4 


atgiggggggcgiggagca 




2257 


6257 


tqctqcacc;atcac;aa(^t 


7701 


7720 


2 


4 


iggggacaic^aicciyyyu 




3341 


• 6258 


qcccaactcgctcccccca 


5796 


5814 


2 


4 


ri « n »a o o t r> d f r» n n O t 

jggacaiLtdiLrOLyyyuuL 


'^24 


3343 


: 6259 


aqqcaggagata acttccc 

*^ M ^^^^ C ^^^^ 


5076 


5095 


2 


4 


3ggagaiacicciggggvc 




338S 


: 6260 


qqcccctgcacgcjcttccc 


6545 


6564 


2 


4 


a ig ug gacigiCLaccai 




3593 


• 6261 


atqctctacgcc^cgac^t 


7718 


7737 


2 


4 


[#Gcig(«CUaCCalUal<C#Ua 


B16<3 


6208 


• 6262 


tqqqtac^aagggagtctgg 


6382 


6401 


2 


4 


yiAAncxTiigayygcgaca 


6Q67 


6Q86 


' 6263 


tatccc;aqggggggagggc 


9147 


9166 


2 


4 


ccagcccccgattgggggc 


1 


20 


. 6264 


gccxgagggcagggcx^gg 


55G 


56S 


• 


4 


acc:atagatcactcccx)tg 


28 


47 


: 6266 


cagggcxsttggcacatggt 


784 


803 




4 


atgagtgtcgtgcagcctc 


9£ 


114 


: 6266 


gaggccgcgatg(x;atcat 


2946 


2965 




4 


gtgcagcctccaggacccc 


104 


123 


: 6267 


gggggacggcatcatg(»c 


6402 


6422 




4 


1 tgcagcxtccaggaccccc 


10£ 


^2A 


: 6268 


gggggoacggcatcatgc^ 


6402 


6421 




4 


i ccaggacccxxxxrtcccgg 


112 


132 


^SEQ ID NO 
SEQ ID NO 
SEQ ID NO 


: 6266 


ccggctggttogttgctgg 


9256 


9274 


' 


4 


^acx:(xccctcccgggagag 


11E 


137 


: 627C 


ctctcatgccaacgtgggt 


636€ 


6387 


1 


4 


) ccccctcccgggagagcc^a 


121 


14C 


1: 6271 


tggcaatgagggcatgggg 


59£ 


617 


' 1 


4 


i agactgctagccgagtagt 


242 


262 


SEQ ID NO 


>: 6272 


! actatgcggtccxcggtct 


3952 


3972 


1 


4 


'agccgagtagtgttgggtc 


251 


27C 


I SEQ ID NO 


6272 


; gaccaggatctcgtcggct 


365€ 


367£ 


1 


4 






WO 2004/080406



PCT/US2004/007070 



SEQ ID NO: 5728
SEQ ID NO: 5729
SEQ ID NO: 5730
SEQ ID NO: 5731
SEQ ID NO: 5732
SEQ ID NO: 5733
SEQ ID NO: 5734
SEQ ID NO: 5735
SEQ ID NO: 5736
SEQ ID NO: 5737

SEQ ID NO: 5738
SEQ ID NO: 5739
SEQ ID NO: 5740
SEQ ID NO: 5741
SEQ ID NO: 5742
SEQ ID NO: 5743
SEQ ID NO: 5744
SEQ ID NO: 5745
SEQ ID NO: 5746
SEQ ID NO: 5747
SEQ ID NO: 5748
SEQ ID NO: 5749
SEQ ID NO: 5750
SEQ ID NO: 5751
SEQ ID NO: 5752
SEQ ID NO: 5753
SEQ ID NO: 5754
SEQ ID NO: 5755
SEQ ID NO: 5756
SEQ ID NO: 5757
SEQ ID NO: 5758
SEQ ID NO: 5759
SEQ ID NO: 5760
SEQ ID NO: 5761
SEQ ID NO: 5762
SEQ ID NO: 5763
SEQ ID NO: 5764


3 pgtgcttqcqaqtqccccq 


291 


5 31< 


iqpn in Mr 




1 cggggccttggttgacacc 


21 3i 




0 

0 


>1 A 
1 ^ 


^ gcxiagtgccccgggaggtc 


30( 


32i 


3 SEQ ID NC 


): 627( 


I gacxxjccggcgtaggtcgc 


67- 


1 69( 


D 


1 4 


) acxgtgcaccatgagcacg 


33' 


35( 


3 SEQ ID NC 


>: 627e 


) cgtgcaatacctgtacggt 


243* 


J 245( 


3 


1 4 


1 cccgggcggtggtcagatc 


4i: 


43' 


1 SEQ ID NC 
)SEQ ID NC 


): 627 ^ 


^ gatcatgcatactcccggg 


991 


I lOlf 




1 4 


igcxjgcgcaggggccccagg 


45- 


47C 


>: 62 n 


I cctgcacgccttccccggc 


654< 


3 656^ 


3 ' 


1 4 


i acxx^cgtggaaggcgacag 


5V 


53( 


)SEQ ID NC 


K 62/i 


Sctgtatgcacccggggggt 


389' 


I 391( 




1 4 


I Dcocgtggaaggcgacagc 


612 


531 


1 SEQ ID NO 
^SEQ ID NC 


K 628C 


) gctgtatgcacccgggggg 389C 


) 3901 




I 4 


agcctatccccaaggctcg 


521 


547 


K 6281 


c^gagggcagggcctgggct 


55: 


J 572 


1 


1 4 


ctatccccaaggctcgccg 


531 


55C 


)SEQ ID NO 


i: 6282 


■ cggctgtcgttcccgatag 


541 e 


3 5437 


^ 1 


1 4 


' tatccccaaggclcgccgg 


532 


551 


1 SEQ ID NO 


>: 6283 


Iccggctgtcgttcccgata 


6417 


' 5436 


1 


1 4 


( cgggtatccttggcccctc 


577 


' 59e 


'SEQ ID NO 


. 6284 


1 gaggccgcaagcxagcccg 


8067 


' 808e 




1 4 


\ gcatggggtgggcaggatg 


60£ 


62£ 


^SEQ ID NO 
>SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 

SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 

SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
3EQ ID NO: 
SEQ ID NO: 
3EQ ID NO: 

3EQ ID NO: 

SEQ ID NO: 
>EQ ID NO: 
>EQ ID NO: 
>EQ ID NO: 
>EQ ID NO: 
>EQ ID NO: 


* 628S 


cateaateccctcflRfl tnr 








1 4 


1 tcctgtcaccccjgcggctc 


63C 


64£ 


• 6285 


gagctqcaaaqctccaaaa 


8522 






>1 
*t 


gggccccacggaccxjccgg 


661 


68C 


: 6287 


ccggccqcatatqcqqccc 


4064 








ggccccacggacccccggc 


662 


681 


: 6288 


gccggccgcatatacQpcc 


4063 




\ 


A 


cggcctcgccgacctcatg 


724 


743 


: 6289 


catqaggatcaicqqqccq, 


6472 


64Q1 






ggcxtcgccgacctcatgg 


725 


744 


: 6290 


ccatqaqqatcatcqqqca 


6471 


64gc 


1 1 




ggccccctagggggcgctg 


764 


783 


: 6291 


cagctccgaattgtcggcc 


7414 


7433 


1 < 
1 


4 


tggcacatggtgtcogggt 


792 


811 


: 6292 


acccacgctqcacaqacca 


5188 


5207 


1 f 


4 


cttcctcttggctctgctg 


868 


887 


: 6293 




OOOO 


5882 


1 < 


A 
t 


catgtcacgaacgactgct 


944 


963 


: 6294 


anr!fl nf n p tpa r* ttr* r* s» trr 
wci ^ ty 1/ li/d t> 1 1 u u CI 


DO*f A 


6866 




4 


gaggcggcggacttgatca 


983 


1002 


: 6295 


lyaiggCaliuclCaguCiC 


0/1^ 


5731 




4 

H 


catccccactacgacaata 


1096 


1115 


• 6296 


Id I iduiyy y y y lu uy dig 




4611 




A 
H 


gctgttcaccttctcgccc 


1207 


122g 


: 6297 


jyyi'iyi/yiyggdddCagc 


Of aO 


8812 




A 


gccccgccggcatgcgaca 


1222 


1241 


6298 


gtctcctacttqaagaac 

* — — """J ■*57?JO 


3814 


3833 




A 
H 


tgdcctgggacatgatgat 


1293 


1312 


6299 


atcaatttgdccctgcca 


5981 


6000 




A 
** 


cacaagccgtcatcgacat 


1362 


1381 


6300 


atgtttgggactgggtgtg 


6279 


6298 




A 


agccgtcatcgacatggtg 


1366 


1385 


6301 


caccaagcaggcggaggd 


5560 


5579 




A 
*t 


Sgtggcgggggcccactgg 


1381 


1400 


6302 


ccagggctcaggccccacc 


5127 


5146 




« 


C7 W W W W ^''^'^^ " W o W W W 


1387 


140fi 


DOUO 


qactagqtactccaccccc 


8641 


866q 




4 


ataacaaaaaactaaacta 


1430 


14J.Q 

l*t*Tw 


DOU*f 


tagcagtgctcacttccat 


6846 


686S 




4 


Hgattgtgatgctacttt 


1454 


1473 


6305 


aaagcaagctgcccatcaa 


7665 


7684^ 




> 


caacgggggggc£\cgctgc 


1500 


1519 


6306 


gcagaaggcgctcgggttg 


5530 


5549 1 


4 


acgctgcccgcctcaccag 


1512 


1531 


5307 


c^tggacccgaggagagcgt 


2278 


2297 




4 


icagagaatccagcttata 


1564 


1583 


630Q 


iatatcgggggtccoctga 


8393 


8412 1 


4 


accaatggcagttggcaca 


1586 


1605. 


63091 


gtggctcggggccttggt 


2132 


2151 1 


4 


:caatggcagttggcacat 


1587 


1606- 


631 Oi 


3tgtggctcggggccttgg 


2131 


2150 1 


4 


SEQIDNO: 57651 


]tcctatcacttatgctga 


1749 


17681 


63111 


caggactggggtaaggac 


4176 


4195 1 


4 


SEQIDNO: 5766c 


;tgagcctacaaaagaccx: 


1764 


17831 


631 2 ( 


;iggtggcttcatgcctcag 


9063 


9082 




4 


SEQIDNO: 5767( 


^aggtgtgtggtccagtgt 


1844 


18635 


6313 


3cactccagttaactcctg 


8817 


8836^ 




4 


SEQIDNO: ^7681 


gtggtccagtgtattgct 


1850 


1869< 


6314c 


^Qc^sggccatcaaccaca 


7949 


7968 




4 


SEQIDNO: 5769s 


;icttcaccccaagtcctgt 


1866 


1885( 


6315 c 


icagcagaggcggctaagc 


6887 


6903 




4 


SEQ ID NO: 577actgttgtcgtggggacaac 


1881 


19005 


631 6 £ 


jttgcaacttggacgacag 


2295 


2314 




4 


SEQIDNO: 5771: 


jccgccgcaaggcaadgg 


1972 


1991 < 


6317c 


^agttggacttatccggc 


9241 


9260 




4 


SEQIDNO: 5772c 


igcaactggttcggctgta 


1982 


2001 < 


631 8, t 


acacgggtgcccattgcc 


7287 


7306 




4 


SEQIDNO: 5773£ 


jcaactggttcggctgtac 


1983 


2002 c 


6319s 


jtacacgggtgcccattgc 


7286 


7305 




4 


SEQIDNO: 5774c 


^cccgtgtaacatcggggg 


2043 


2062 s 


6320|ccccaatcgatgaacgggg 


9376 


9395 




4 
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SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 

SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



5775 bgactgcttccggaagcac
5776 gactgcttccggaagcacc
5777 tccggaagcaccccgaggc
5778 actcaaaatgtggctcggg
5779 ggccttggttgacacctag
5780 aggagagcgttgcaacttg
5781 ggacagatcggagctcagc
5782 cagatcggagctcagcccg
5783 ggagctcagcccgctgctg
5784 caccctaccggctctgtcc
5785 cggctctgtccactggctt
5786 ccatcagaacatcgtggac
5787 ggtcagcggttgtctcctt
5788 gccgccttagagaacctgg
5789 gccttagagaacctggtgg
5790 gccggagcgcacggcatcc
5791 gctgcatcgtgcggaggcg
5792 attattgaccttgtcgcca
5793 Icgccatattacaaggtgt
5794 cgccatattacaaggtgtt
5795 gtccggggaggccgcgatg
5796 tcaccccactgcgggattg
5797 ttgggcccacgccggccta
5798 ctacgggaccttgcggtag
5799 cctgtcgtcttctctgaca
5800 ctgtcgtcttctctgacat
5801 cx:tggggggcagacaccgc
5802 gggggcagacaccgcggcg
5803 ggcgtgtggggacatcatc
5804 tggggccggccgatagtct
5805 gaaccaggtcgagggggag
5806 gagggggaggttcaagtgg
5807 aggcccaatcgcccagatg
5808 ggcccaatcgcccagatgt
5809 caggatctcgtcggctggc
5810 aggatctcgtcggctggcc
5811 gccccccggggcgcgttcc
5812 gcacctgtggcagctcgga
5813 ctgtggcagctcggacctt
5814 gcggggcgacaatagaggg
5815 ggagcttgctctcccccag
5816 gagcttgctctcccccagg
5817 acttgaagggctcttcggg
5818 tgtccccgttgagtccatg
5819 gaaactactatgcggtccc
5820 aaactactatgcggtcccc
5821 ctcccactggcagcggcaa
5822 ggcgtatatgtctaaagca


2092 



2093 



2100 



2124 



2142 



2287 



2314 



2317 



2323 



2383 



2391 



2419 



2460 



2579 



2582 



2621 



2786 



2824 



2837 



2838 



2939 



2111SEQ 



2112SEQ 



2119SEQ 



21 43 SEQ 



21 61 SEQ 



2306 SEQ 



2333SEQ 



2336 SEQ 



2342 SEQ 



2402SEQ 



2410SEQ 



2438 SEQ 



2479 SEQ 



2598 SEQ 



2601 SEQ 



2640SEQ 



2805SEQ 



2843 SEQ 



2856 SEQ 



2857 SEQ 



2958 SEQ 



3201 



3217 



3233 



3260 



3261 



3297 



3301 



3316 3335 SEQ 



3378 



3499 



3509 



3625 



3626 



3659 



3660 



3682 



3711 



3716 



3775 



3792 



3793 



3822 



3928 



3948 



4032 



4138 



3220 SEQ 



3236 SEQ 



3252 SEQ 



3279 SEQ 



3280 SEQ 



331 6 SEQ 



351 8 SEQ 



3645 SEQ 



3678 SEQ 



3734SEQ 



3947 3966 SEQ 



4051 SEQ 



4 157 SEQ 



3320 SEQ 



3397SEQ 



3528 SEQ 

3644SEQ 



3679SEQ 



3701 SEQ 



373QSEQ 



3794 SEQ 



3811 SEQ 



381 2 SEQ 



3841 SEQ 



3947 SEQ 



3967SEQ 



ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 

ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 

ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 



6321 


otactaataaacQoaatcc 


5324 


5343 




4 


6322 


ggtgctqqtaggcggagtc 


5323 


5342 




4 


6323 


Qcctacaagtcttcacgga 


6616 


8635 




4 


6324 


cccQaacaQCpagtcQaqt 


8201 


8220 




4 


6325 


ctaaccaacccaaaaciQCC 


3611 


3630 




4 


6326 


caaaccataatci aactcct 


8162 


8181 




4 


6327 


QctaoQaatcattatatcc 


3128 


3147 




4 


6328 


cgggtggcccactgctctg 


3837 


3856 




4 


6329 


cagctgctgaagaggctcc 


6206 


6225 




4 


6330 


pgactgggtgtgcacggtg 


6286 


6305 




4 


6331 


aagcaggcggaggctgccg 


5564 


5583 




4 


6332 


atccccattaaatccataa 


392d 


3948 




4 


6333 


aaaaataattctaataacc 


8875 


8894 




4 


6334 


ccaattaaacttatccaac 


9241 


9260 




4 


6335 


ccaccaaacaaQcqqaqqc 


5559 


5578 




4 


6336 


aaattaaacccacoGCQQC 


3214 


3233 


1 


4 


6337 


cQccacdacatcccQcaQC 


7726 


7745 




4 




(■y y CI i^d ^ d U^i^ i^id CI I 


4B47 


4RBR 


-7 


4 




d vd l/dd iv I iWA^ ly ^7 




3558 


T 


.4 




a a pa paatnHtpptn n pn 


w wwiJ 




T 


4 


6341 


cstcQQcacacitcctaaac 


4327 


4346 




4 


6342 


caatttaccaatatfotaa 

i/uHiitbiwwbiwii ly iiy »y vm 


8326 


8344 

W W 1 » 




4 


6343 


lay y ^ loy y y y vv/y I wwd 


5221 


5240 




4 


6344 


rtartcctactttctataa 


9338 

W W WVJ 


9357 

www f 




4 


6345 


tatcciacacataaacaaa 


7617 


7636 


1 


4 


6346 


atatcctacacataoacaa 


7616 


7636 




4 


6347 


□caaaataaaactaacaaa 
y yyyy 


4804 


4823 




4 


6348 


cacccaactcacteccccc 

vy wO a V * V M< VWW 


5794 


5813 




4 


6349 




3755 

w f WW 


3774 

wl f ~ 




4 


6350 


agacgacgaccgtgcccca 


4761 


4780 




4 


6351 


ctccacctatggcaagttc 


4222 


4241 




4 


6352 


ccacctgtcaaggcx:cctc 


7304 


. 7323 




4 


6353 


catcccgcagcgcgggcct 

C/ WW www 


7734 


7753 




4 


6354 


acatcccgcagcgcgggcc 


7733 


7752 




4 


6355 


pccaataQQccatttcctg 


9410 


9429 




4 


6356 


ggccaataggccatttcct 


9409 


9428 




4 


6357 


ggaacctatccagcagggc 


7938 


7957 




4 


6358 


tccggtggtacacgggtgc 


7279 


7298 




4 


6359 


aaggcaaaggcgtccacag 


7826 


7845 




4 


6360 


ccctgcctgggaaccccgc 


5682 


5701 




4 


6361 


ctggttgggtcacagctcc 


6806 


6825 




4 


6362 


cctggttgggtcacagctc 


6805 


6824 




4 


6363 


cccgtggtggagtccaagt 


5585 


5604 




4 


6364 


catggtctacgccacgaca 


7717 


7736 




4 


6365 


gggaaggcacctcattttc 


4504 


4523 




4 


6366 


ggggogcatatacaggtlt 


4828 


4847 




4 


6367 


ttgccaggaccatctggag 


4993 


5012 




4 


6368 


tgctcgccaccgctacgcc 


4377 


4396 




4 
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SEQ ID NO; 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



5823gcgtatatgtctaaagcac 



5824tggggtaaggaccattacc 



5825 accattaccacgggcgccc 



5826 cgtactccacctatggcaa 



5827 cagtcctggaccaagcgga 



5828!aggggggaaggcacctcat 



4139 



4183 



4193 



4218 



4335 



4500 



5829 €actccaagaagaagtgcg 
583 datcaatgctgtagcgtatt 
5631 tataccgaccagcggagac 



5832aggactggcaggggcaggQ 



5833|gggaacggccctcgggcat 



5834 cgggcatgttcgattcctc 



5835tggtacgagctcacccccg 



5836gggcttacctaaatacacc 



5837ggcttacctaaatacacca 



5838 gagataacttcccctacct 



5839cccacctccatcgtgggat 



584Qcatggcatgcatgtcggcc 



5841 ggccgacctggaagtcgtc 



5842gccgacctggaagtcgtca 



5843|tggaagtcgtcaccagcac 



5844gcacctgggtgctggtagg 



5845ggttatcgtgggtaggatc 



5846cccgaiagggaagtcctct 



5847 tgaaatggaagaatgcgcc 



5848 ccaagtggcgagctttgga 



5849ncatcagcgggatacagt 



5850 agcgggcttatccaccctg 

5851 ccagcccgctcaccaccca 



6852 gtgggcgctggtatcgctg 



5853ggaaggtgctagtggacat 



5854 ggtcatgagcggcgaggcg 
5855'catgtgggcccgggagagg 



5856atgtgggc€xgggagaggg 



5857 ggggccgtgcagtggatga 



5858,gcgttcgcttcgcggggta 



5859 ggggtaaccatgtctcccc 



5860catcacccagctgctgaag 



5861 aggactgttctacgccgtg 



5862|ttcaagacctggctccagt 
586: 
586' 



3ct 
4ci 



cctgccgcggttaccgg 



5865 ggaggtcacgcgggtgggg 



5866 gaggtcacgcgggtggggg 



accacgggcccctgcacg 



5867atgtcaggttccagctcct 



5868 atgaaatatccattgcggc 



5869 ctccattgttagagtcttg 



5870 tgcccattgccacctgtca 



4526 



4577 



4618 



4811 



4857 



4869 



4922 



4962 



4963 



5082 



5140 



5278 



5293 



5294 



5301 



5316 



5383 



5429 



5461 



5598 



5645 



5668 



5736 



5831 



5877 



5944 



6056 



6057 



6074 



6104 



6117 



6199 



6240 



6314 



6338 



6538 



6616 



6617 



6682 



7152 



7239 



7295 



4158|SEQ ID NO 



4202 SEQ ID NO 



421 2 SEQ ID NO 



4237 SEQ ID NO 



4354 SEQ ID NO 



451 9 SEQ ID NO 



4545 SEQ ID NO 



4596 SEQ ID NO 



4637 SEQ ID NO 



4830SEQ ID NO 



4876SEQ ID NO 



4888 iQ f,jo 

4941 SEQ ID NO 
4981 SEQ ID NO 



4982 SEQ ID NO 
5101 SEQ ID NO 



5159SEQ ID NO 



5297 SEQ ID NO 



531 2 SEQ ID NO 



531 ^SEQ ID NO 



5320 SEQ ID NO 



5335 SEQ ID. NO 



5402SEQ ID NO 



5448SEQ ID NO 



5480 SEQ ID NO 



561 7 SEQ ID NO 



5664SEQ ID NO 



5687SEQ ID NO 



5755SEQ ID NO 



5850SEQ ID NO 



5896SEQ ID NO 



5963 SEQ ID NO 



6075 SEQ ID NO 



6076SEQ ID NO 



6093 SEQ ID NO 



ei23SEQ ID NO 



^^^^SEQ ID NO 



62 18 SEQ ID NO 



6259 SEQ ID NO 



6333 SEQ ID NO 



6357 SEQ ID NO 



6557 SEQ ID NO 



6635SEQ ID NO 



6636SEQ ID NO 
6701 SEQ ID NO 
7171 SEQ ID NO 



7258SEQ ID NO 



7314|SEQ ID NO 



6369|gtgctcgccaccgctacgc 



6370ggtaaccatgtctccccca 



6371 gggcgctggtatcgctggl 



6372 ttgccccaaccagaatacg 



6373 tccgtgagccgcatgactg 



6374 atgagcggcgaggcgccct 



6375 cgcatgactgcagagagtg 



6376 aatacgacttggagttgat 



6377 gtctcccccacgcactatg 



6378 ccctgccatcctctctcct 



6379 atgctcaccgacccctccc 



6380 gaggccgcaagccagcccg 



6381 



6382 ggtggctccatcttagccc 



6383lggtggctccatcttagcc 



6384 aggttggccagggggtctc 



cggggacttgccccaacca 



6385 atccaagtftggctatggg 



6386 ggcctctctgcagatcatg 



6387 gacgcccccacattcggcc 



6388 tgacgcccccacattcggc 



4376 



4395 



611 



6138 



5833 



5852 



8669 



8688 



9560 



9579 



5948 



5967 



9569 



9588 



8682 



8701 



6128 



614- 



5992 



6011 



6863 



688: 



8067 



8081 



8662 



8681 



9518 



953: 



9517 



953< 



6908 



692' 



7906 



7921 



9596 



961 



7885 



79( 



7884 



790a. 



6389 gtgcccatgtcaggtt£x» 



6390 cctacacatggacaggtgc 



6391 



6392 agagcggctttatatcggg 



6393 ggcgcgctcgtggccttca 



6394 tocattgttagagtcttgg 



gatcatcgggccgaaaacc 



6395 actgcacgatgctcgtgaa 



6396 caggggtggctggcgcgct 



6397 tgggcgctggtatcgctgg 



6398 cagcagggccatcaacxac 



6399 atgtggtctccacocttcc 



6400 cgcccctcctgaccagacc 



6401 cctccttgagggcgacatg 



6402 ccctccttgagggcgacat 



6403 tcatgctcctctatgcccc 



6404 taccaccacgagcttacgc 



6405 gggggagccgggggacccc 



6406 cltcgagcggagggggatg 



6411 



6407 cacggcgaccgcccxjtcct 



6408 actgcacgatgctcgtgaa 



6409 ccgggacgtgcttaaggag 



641 0 cgtggaggtcacgcgggtg 



cccctccaataccacctcc 



641 2 cccctcctgaccagacctc 



641 3 aggagatgggcggaaacat 



641 4 gccgtgatgggctcctcat 



641 5 caagtggcgagctttggag 



641 6 tgactaattcaaaagggca 



6676 



669j 



7620 



7639 



6478 



649^ 



8383 



840: 



5924 



5943 



7240 



7259 



8541 



8560 



5913 



5932 



5832 



5851 



7948 



7967 



8142 



8161 



7453 



747: 



6969 



6988 



6968 



698* 



7505 



7524 



2751 



2770 



7531 



7560 



7130 



7149 



7444 



7463 



8541 



8560 



7804 



7823 



6613 



6632 



7317 



7336 



7455 



747^ 



7059 



7078 



8165 



8184 



5599 



5618 



8409 



8428 
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SEQ ID NO 



587l|accacctccacggagaaaa | 7327 



5$72ccacctccacggagaaaaa | 732g 



5873acctccacggagaaaaagg I 7330 



5874 ggttgtcctgacggactcc I 7351 



5875cctgaccagacctccgaca [ 7460 



5876 agcaagctgcccatcaacg | 7667 



5877ggatgaccattacogggac | 7792 



5878tggcaaagaatgaggtttt 8028 



5879 ggcaaagaatgaggttttc | 8029 



5880gggcagcgggtcgagttcc I 8204 

5881 gactagctgcggtaata^ 8470 



5882ctcgcgatcccaccacccc [ 8766 



5883 aggatgattctgatgacoc I 8876 



5884 agccacttgacctacctca | 8976 



5885gggtacx:gcccttgcgagt | 9090 



5886 ctgcaatgactccctccag I 1624 



5887 ccagcccccgattgggggc 



5888aaggcgacagcctatcccc | 52 



5889ggccccacggacccccggc | 662 



5890 gaggcggcggacttgatca I 983 
5891 ctgcaattgttcgatctac I 1249 



5892 ctccagactgggtttcttg I 1637 



5893 tcgtacctgcgtcgcaggt 1830 



5894caagacgtgcggggccccc 2026 



^^^5 aatgctgcatgcaactgga I 2264 



5896 caccctaccggctctgtcc I 2383 



7346|SEQ ID NO 



7347feEQ ID NO 



7349feEQ ID NO 



7370feEQ ID NO 



7479feEQ ID NO 



7686bEQ ID NO 



7811feEQ ID NO 



8047feEQ ID NO 



804fflSEQ ID NO 



^223SEQ ID NO 



8489SEQ ID NO 



878aSEQ ID NO 



SBSaSEQ ID NO 



5897 cgccatattacaaggtgtt I 2838 



5898cgaagccatcaagggggga| 4489 



5899ccagcccgctcaccaocca | 5736 



590Q ggctatgactaggtactcc 
5901 ctccaccatagatcactcc 

5902ltccaccatagatcactccc 



8635 



24 



25 



5903 caccatagatcactcccct 



27 



5904 tcactcccctgtgaggaac 



36 



5905 cgttagtatgagtgtcgtg 
5906 



88 



gtcgtgcagcctccagga 100 



5907ccccccctcccgggagagc | 119 



5908 ggagagccatagtggtctg | 131 



5909 gagccatagtggtctgcgg I 134 



591 0 gtggtctgcggaaccggtg I 1 42 1 

591 1 agtacaccggaattgccag I 161 



591 2 ggtcctttcttggatcaac 

591 3 ttcttggatcaacccgctc 



188 



591 4 ctcaatgcctggagatttg 210 



194 



59 1 5 tgcctggagatttgggcgt 215 



591 6 gcctggagatttgggcgtg I 216 



591 7 gagatttgggcgtgccccc 221 



8995SEQ ID NO 



9109SEQ ID NO 



1643 SEQ ID NO 



20SEQ ID NO 



539 bEQ ID NO 
68lb EQ ID NO 
IQOa SEQ ID NO 
1268 SEQ ID NO 
1656 bEQ ID NO 
1849 teEQ ID NO 
2M9SEQ ID NO 
228|SEQ ID NO 
2402|sEQ ID NO 
lEQ ID NO 



2857 



450qSEQ ID NO 



575g sEQ ID NO 
8654 SEQ ID NO 

^^ EQ ID NO 
4jsEQ ID NO 
4g SEQ ID NO 
5a sEQ ID NO 

^QTp EQ ID NO 
119|SEQ ID NO 

;eq id no 



150SEQ ID NO 



153SEQ ID NO 



161teEQ ID NO 



180bEQ ID NO 



207teEQ ID NO 



213teEQ ID NO 



229^EQ ID NO 



234bEQ ID NO 



235kEQ ID NO 



240bEQ ID NO 



: 641^ 


tottccctctttatggt 


95o: 


?| 952' 


1 


4 


: 641 f 


^ttttccctctttatggtgg 


950' 


\ 952: 


3 ' 


4 


: 641 c 


cctttgacagactgcaggt 


777( 


y 77Bi 


J ' 


4 


: 642C 


ggagctcgctaccaaaacc 


739( 


) 7405 


) ' 


4 


: 8421 


t tgtcctacacatggacagg 


7611 


r 763( 




4 


: 6422 


I cgttgagcaactctttgct 


TQBi 


5 770f 


> 1 


4 


: 642c 


gtcccagttggacttatcc 


923e 


925*? 


r 


4 


: 6424 


[ aaaaagccctggattgcca 


8931 


1 895C 
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6035ggggacatcatcctgggcc 



6036 acctgtdccgcccgaagg 



6037 tgtctccg cccga agggga 



6038 gggagatactcctggggcc 



6039ctcccaacagacccggggc 



6040 tccacx:gcaacacaatctt 



6041 cacaatctttcctggcgac 
6042|ggctggccggcgccccccg 



6043 ccccggggcgcgttccctg 



6044 tccctgacaccatgcacct 



6046|ttccggtgcgccggcgggg 
604f 
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:cccccaggcctgtctcc 



6046 tttgtccccgttgagtcca 



6049 ccgtaccgcaaacattcca 



6050 caagtggcccatctacacg 

6051 cacgctcccactggcagcg 



6052 ccgcatatgcggcccaagg 



ggggttgcaaaggcggtg 



6053 cgtatatgtctaaagcaca 



6054 gtatatgtctaaagcacat 



6055 ggaccattaccacgggcgc 



6056 cccccattacgtactccac 



6057 agttccttgccgacggtgg 



6058 gagacggctggagcgcggc 
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2952 SEQ 



3027 SEQ 
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3051 SEQ 



3052 SEQ 
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3164SEQ 
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3281 SEQ 



33 18 SEQ 



3319SEQ 



3340 SEQ 



3341 SEQ 



3342 SEQ 



3362 SEQ 



3365SEQ 



3385SEQ 



3458 SEQ 



3549 SEQ 



3559 SEQ 



3690 SEQ 



3704 SEQ 



371 7 SEQ 



3781 SEQ 
'SEQ 



3821 



3923 SEQ 



3945 SEQ 



4015SEQ 



4032 SEQ 



4047 SEQ 



4087 SEQ 



4159SEQ 



41 60 SEQ 



4210 SEQ 
^SEQ 



4228 



4255 SEQ 



4352 4371 SEQ 
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ID NO 
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ID NO 

ID NO 
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ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 

ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
ID NO 
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ID NO 

ID NO 
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3 
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3 


6579 
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4020 
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6580 


gcccatctacacgctccca 


4019 
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3 
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5815 
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5937 
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3 


6599 
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5163 


5182 




3 


6600 


atgtggaagtgtctcatac 


5162 


5181 




3 


6601 


gcgcgtgtcactcaggtcc 


6167 


6186 




3 


6602 


gtgggcccgggagaggggg 


6059 


6078 




3 


6603 


ccacagtcaaggctaaact 


7839 
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3 
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7537 
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SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 

SEQ ID NO: 
SEQ ID NO: 

SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 

SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



6059|caccgctacgcctcca9ga 



6060 tggagagatccccttctac 



6061 agccatcccx;atcgaagcc 



6062 tocccatcgaagccatcaa 



6063 ccccatcgaagccatcaag 



6064 ggcctcggaatcaatgctg 



6065 gtccgtcataccgaccagc 



6066 gtcataccgaccagcggag 



6067 cgggctalaccggtgactt 



6068 ctttgattcagtgatcgac 



6069 acagtcgacttcagcttgg 



6070 cttggaccccaccttcacc 



6071 gagacgacgaccgtgccoc 



6072 ggggtaggactggcagggg 



6073gggcatatacaggtttgta . 



6074 gggggaacggccctcgggc 



6075 tgacgcgggctgtgcttgg 



6076 gacgcgggctgtgcttggt 



6077 tgcttggtgcgagctcacc 



6078 tgcccacttcctgtcccag 



6079ggtggcataccaagocaca 



6080 gggctcaggccccacctcc 



4384 



4453 



4477 



4482 



4483 



4568 



4612 



4616 



4668 



4684 



4724 



4738 



4760 



4403 SEQ ID NO 
4472 SEQ ID NO 
4496 SEQ ID NO 
4501 SEQ ID NO 
4502 SEQ ID NO 
4587 SEQ ID NO 
4631 SEQ ID NO 



4635SEQ ID NO 



4687 SEQ ID NO 



4703 SEQ ID NO 



4743SEQ ID NO 



4757SEQ ID NO 



4806 4825SEQIDNO 



4831 



4855 



4906 



6081 ccatcgtgggatcaaatgt 



6082 tcatacggctaaaacccac 



6083tgctgtataggctaggggc 



6084 ccaaatacatcatggcatg 



6085 ggagtcctcgcagctctgg 
608d gcctgacaac aggcagtgt 
7a< 



608 



6088catgtggaatttcatcagc 



6089 ctctatcaccagcccgctc 



6090 cccagaacaccctcctgtt 



agccaccaagcaggcggag 



6091 ctcctgtttaacatcttgg 



6092 ttgggggggtgggtagccg 



6093tgcttcggctttcgtgggc 



6094 Icgtgggcgctggtatcgc 



6095 cgctggtgcggctgttggc 



6096 cggctgttggcagcatagg 



6097 ggggcaggggtggctggcg 



6098 ctggcgcgctcgtggcctt 



6099 tggcgcgctcgtggccttc 



61 00 gagcggcgaggcgccctct 



6101 tgggcccgggagagggggc 



61 02 cggctgalagcgttcgctt 



61 03 gtgcctgagagcgacgccg 



61 04 atgaggactgttctacgcc 



61 05 gtccaagctcctgccgcgg 



4907 



4918 



5050 



5101 



4779 SEQ ID NO 



4850SEQ ID NO 



4874 SEQ ID NO 



4925SEQ ID NO 



4926SEQ ID NO 



4937SEQ ID NO 



5069 SEQ ID NO 



512PSEQ ID NO 



5130 5149SEQIDNO 
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5214 



5268 



5336 
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5635 



5728 



5751 



5762 



5777 



5818 



51 66 SEQ ID NO 



5829 



5853 
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5922 



5923 



5950 



6060 



6095 



6146 



6237 
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6165 



51751 5194 SEQIDNO 
5233 iQ NO 
5287 SEQ ID NO 
5355 SEQ ID NO 
5383 SEQ ID NO 
5576 SEQ ID NO 
5654 SEQ ID NO 
5747 SEQ ID NO 
5770 SEQ ID NO 
6781 SEQ ID NO 



5796 SEQ ID NO 



5837 SEQ ID NO 



5848 SEQ ID NO 



5845| 5864SEQ ID NO 
5872 SEQ ID NO 
5928 SEQ ID NO 
SEQ ID NO 
5942 SEQ ID NO 
5969 SEQ ID NO 
6079 SEQ ID NO 
6114 SEQ ID NO 
SEQ ID NO 



6256SEQ ID NO 



6350ISEQ ID NO 





) luuiacacaiggacaggig 


\ 7618 


7636 








* gidy cagigcicacTlcca 


1 CO AC 

1 Do4c 


686^ 








ggciggLicgngciggci 


1 9257 


927e 








"gagggggagccggggga 


\ 7527 
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1 752o 


Ac 

754S 




3 


RRir 
OO lU 


cagciccgaangtoggcc 
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OD 1 1 
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gicgagncciggtaaaag 
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y u y y "'^^^^^'U 
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isllU 






6618 
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OOOr 
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6619 
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I 04 1 






6620 


gcccctgcacgccttcccc 


6546 


6565 
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cx^aattgacaccaccgtca { 


80od 


8028 
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6622 


accaattgacaccaccgtc | 


800a 


8027 
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6623 


9gtgcggctgttggcagca 


5849 


5868 






6624 


ctgggcgcgctgacgggca 
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8002 
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6626 
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acattctggcgggctatgg 


5892 
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3 


6628 


gtggcctlcaaggtcatga | 


5933 


5952 




3 



6629gcccgaaccggacgtagca 68321 6851 



6630 catgcctcaggaaacttgg | 90721 9091 



6631 



6632 



ccagctgtctgcgccctcc 6959 6974 



acactccaggccaataggc 9401 9420 



6633dccagttaactcctggct 88201 8839 



6634gctgcgccatcacaacatg | 77021 7721 



6635 gagccgcatgactgcagag { 95651 9584 



6636 aacatcttgggggggtggg 57711 5790 



6637 ccaatcgatgaacggggag | 9378| 9397 



6638 cggcgccaaactattccaa 65641 6583 



6639 gcccgaaccggacgtagca I 6832) 6851 



6640 


gcgagcggcgtgcfgacga 


8453 


8472 






6641 


gccacgacatcccgcagcg 


7727 
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6642 


cctagactctttcgagccg 


7111 


71 3q 




3 


6643 


cgcccaactcgctcccccc 


5794 


5813 




3 


6644 


aagggaggccgcaagccag 


8063 


8082 






6645 


gaagggaggccgcaagcca 


8062 


8081 




3 


6646 


agagcgtcgtctgctgctc 


7596 


7615 


1 


3 


6647 


gcccatctacacgctccca 


4019 
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SEQ ID NO: 
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SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 
SEQ ID NO: 



61 06|acagatcgccggacatgtc 



61 07acgtggcatggaacattcc 



6 1 08 gggcccctgcacgccttcc 



6 1 09 agtgcccatgtcaggttcc 



61 1 0tgcccatgtcaggttccag 



61 1 1 cagctcctgagtttttcac 



61 12 tcacggaggtggaiggggt 



6113cacggaggtggatggggtg 



6114 gacccctcccacattacag 



6115 ttggocagggggtctcccc 

6116 ccttgagggcgacatgcac 



6122 



61 1 7 ggagatgggcggaaacatc 



61 1 8 gagatgggcggaaacatca 



6442 



6506 



6544 



6675 



6677 



6693 



6708 



6709 



6872 



6911 



6972 



7060 



7061 



61 1 9 ctagactctttcgagccgc 



61 20 tagactctttcgagccgct 



61 21 agaatgaaatatccattgc 



tgcggcggagatcctgcg 



61 23 agcgaggaggctggtgaga 



61 24 tgagagcgtcgtctgctgc 



6126gtcgtctgctgctcaatgt 



61 26 tgcgccatcacaacatggt 



6 1 27 cagaagaaggtcacctttg 

6 1 28 cctggatgaccattaccgg 

6129 ggacgtgcttaaggagatg 

6 1 30 aaagaatgaggttttctgc 

61 31 agttcgtgtatgcgagaag 



61 32ggctataaaatcgctcaca 



61 33 ttctccatccttclagctc 



61 34 tgtctcgtgcccgaccccg 



7112 



7113 



7149 



7164 



7580 



7594 



7601 



7704 



7757 



7789 



6461 [SEQ ID NO: 6652|gacatatatcacagoctgt 



652^ SEQ ID NO: 6653 ggaagaacccggactacgt 



6563 SEQ ID NO: 6654ggaagaaagcaagctgccc 



6694 SEQ ID NO: 665^ggaaacagctagacacact 



6696SEQ ID NO: 6656 



6712 SEQ ID NO: 6667gtgagagcgtcgtctgctg 



ctgggcgcgctgacgggca 



6727 SEQ ID NO: 6658 acccttcctcaagccgtga 



6728 SEQ ID NO: 
6891 SEQ ID NO: 



6930 SEQ ID NO: 
6991 SEQ ID NO: 



665g cacccttcctcaagccgtg 
6660 ctgttttgactcaacggtc 



6661 



ggggtgggtagccgcccaa 



6662gtgcttaaggagatgaagg 



9287 



9306 



7257 



7276 



7660 



7679 



8803 



6822 



3164 



3183 



7593 



7612 



8153 



8172 



8152 



8278 



8171 
8297 



5782 



5801 



7079 SEQ ID NO: 6663 gatgacccatttcttctcc 



7080 SEQ ID NO: 
7131 SEQ ID NO: 



6664 tgatgacccatttcttctc 



71 32 SEQ ID NO: 6666 agcgacgggtcttggtcta 



7 168 SEQ ID NO: 



7183SEQ ID NO: 



6665 gcggcgtgctgacgactag 



7599SEQ ID NO: 6669 



6667 gcaaagaatgaggttttct 
6668cgcacgatgcatctggcaa 



tc 



761 3 SEQ ID NO: 6670gcagtaaagaccaagctca 



ctcgtgcccgacccogct 



7620 SEQ ID NO: 6671 acatggtctacgccacgac 



7723 SEQ ID NO: 6672 accatgtctcccccacgca 



7776 SEQ ID NO: 6673 caaagaatgaggttttctq 



8032 



7808 SEQ ID NO: 6674 ccggaacctatccagcagg 



7807 7826 SEQ ID NO: 6675 catcgggccaggagcgtcc 



8110 



8051 SEQ ID NO: 6676 gcagaagaaggtcaccttt 



8365 



8900 



9303 



81 29 SEQ ID NO: 6677 cttcatgcctcaggaaact 



8384SEQIDNO: 6678 



8^ SEQ ID NO: 6679 gagcggagggggatgagaa 



tgtgaaaggtccgtgagcc 



9322| SEQ ID NO: 6680[cggggcgcgttccctgaca 



7811 



7830 



8887 



8906 



8886 



8905 



8457 



8476 



7556 



7575 



8030 



8049 



8730 



8749 



9305 



9324 



9197 



9216 



7716 



6123 



7735 
6142' 



8031 



7936 



8050 
7955 



9116 



9135 



7756 



9069 



7775 



9088 



9551 



9570 



7134 



7153 



3688 



3707 
Table 14. Sequences from human hepatitis C virus (HCV) (Direct Match Type)

Source                                Start   End     Match                                 Start   End     Match
                                      Index   Index                                         Index   Index   #
SEQ ID NO: 6755  ttttttttttttttttttt  9446    9465    SEQ ID NO: 6758  ttttttttttttttttttt  9466    9485    2
SEQ ID NO: 6756  ttttttttttttttttttt  9446    9465    SEQ ID NO: 6769  ttttttttttttttttttt  9465    9484    1
SEQ ID NO: 6757  ttttttttttttttttttt  9447    9466    SEQ ID NO: 6760  ttttttttttttttttttt  9466    9485    1



Table 15. Sequences of Exemplary Gene Targets 

gi|4502152|ref|NM_000384.1| Homo sapiens apolipoprotein B (including Ag(x)
antigen) (APOB), mRNA

attcccaccgggacctgcggggctgagtgcccttctcggttgctgccgctgaggagcccgcccagccagc 
cagggccgcgaggccgaggccaggccgcagcccaggagccgccccaccgcagctggcgatggacccgccg 
aggcccgcgctgctggcgctgctggcgctgcctgcgctgctgctgctgctgctggcgggcgccagggccg 
aagaggaaatgctggaaaatgtcagcctggtctgtccaaaagatgcgacccgattcaagcacctccggaa 
gtacacatacaactatgaggctgagagttccagtggagtccctgggactgctgattcaagaagtgccacc 
aggatcaactgcaaggttgagctggaggttccccagctctgcagcttcatcctgaagaccagccagtgca 
ccctgaaagaggtgtatggcttcaaccctgagggcaaagccttgctgaagaaaaccaagaactctgagga
gtttgctgcagccatgtccaggtatgagctcaagctggccattccagaagggaagcaggttttcctttac 
ccggagaaagatgaacctacttacatcctgaacatcaagaggggcatcatttctgccctcctggttcccc 
cagagacagaagaagccaagcaagtgttgtttctggataccgtgtatggaaactgctccactcactttac 

cgtcaagacgaggaagggcaatgtggcaacagaaatatccactgaaagagacctggggcagtgtgatcgc 

ttcaagcccatccgcacaggcatcagcccacttgctctcatcaaaggcatgacccgccccttgtcaactc 
tgatcagcagcagccagtcctgtcagtacacactggacgctaagaggaagcatgtggcagaagccatctg 
caaggagcaacacctcttcctgcctttctcctacaacaataagtatgggatggtagcacaagtgacacag 
actttgaaacttgaagacacaccaaagatcaacagccgcttctttggtgaaggtactaagaagatgggcc 
tcgcatttgagagcaccaaatccacatcacctccaaagcaggccgaagctgttttgaagactctccagga 
actgaaaaaactaaccatctctgagcaaaatatccagagagctaatctcttcaataagctggttactgag 
ctgagaggcctcagtgatgaagcagtcacatctctcttgccacagctgattgaggtgtccagccccatca 
ctttacaagccttggttcagtgtggacagcctcagtgctccactcacatcctccagtggctgaaacgtgt 
gcatgccaacccccttctgatagatgtggtcacctacctggtggccctgatccccgagccctcagcacag 
cagctgcgagagatcttcaacatggcgagggatcagcgcagccgagccaccttgtatgcgctgagccacg 
cggtcaacaactatcataagacaaaccctacagggacccaggagctgctggacattgctaattacctgat 
ggaacagattcaagatgactgcactggggatgaagattacacctatttgattctgcgggtcattggaaat 
atgggccaaaccatggagcagttaactccagaactcaagtcttcaatcctcaaatgtgtccaaagtacaa 
agccatcactgatgatccagaaagctgccatccaggctctgcggaaaatggagcctaaagacaaggacca 
ggaggttcttcttcagactttccttgatgatgcttctccgggagataagcgactggctgcctatcttatg 
ttgatgaggagtccttcacaggcagatattaacaaaattgtccaaattctaccatgggaacagaatgagc 
aagtgaagaactttgtggcttcccatattgccaatatcttgaactcagaagaattggatatccaagatct 
gaaaaagttagtgaaagaagctctgaaagaatctcaacttccaactgtcatggacttcagaaaattctct 
cggaactatcaactctacaaatctgtttctcttccatcacttgacccagcctcagccaaaatagaaggga 
atcttatatttgatccaaataactaccttcctaaagaaagcatgctgaaaactaccctcactgcctttgg 
atttgcttcagctgacctcatcgagattggcttggaaggaaaaggctttgagccaacattggaagctctt 
tttgggaagcaaggatttttcccagacagtgtcaacaaagctttgtactgggttaatggtcaagttcctg 
atggtgtctctaaggtcttagtggaccactttggctataccaaagatgataaacatgagcaggatatggt 
aaatggaataatgctcagtgttgagaagctgattaaagatttgaaatccaaagaagtcccggaagccaga 
gcctacctccgcatcttgggagaggagcttggttttgccagtctccatgacctccagctcctgggaaagc 
tgcttctgatgggtgcccgcactctgcaggggatcccccagatgattggagaggtcatcaggaagggctc 
aaagaatgacttttttcttcactacatcttcatggagaatgcctttgaactccccactggagctggatta 
CAGTTGCAAATATCTTCATCTGGAGTCATTGCTCCCGGAGCCAAGGCTGGAGTAAAACTGGAAGTAGCCA 

ACATGCAGGCTGAACTGGTGGCAAAACCCTCCGTGTCTGTGGAGTTTGTGACAAATATGGGCATCATCAT 

TCCGGACTTCGCTAGGAGTGGGGTCCAGATGAACACCAACTTCTTCCACGAGTCGGGTCTGGAGGCTCAT 

GTTGCCCTAAAAGCTGGGAAGCTGAAGTTTATCATTCCTTCCCCAAAGAGACCAGTCAAGCTGCTCAGTG 

GAGGCAACACATTACATTTGGTCTCTACCACCAAAACGGAGGTGATCCCACCTCTCATTGAGAACAGGCA 

GTCCTGGTCAGTTTGCAAGCAAGTCTTTCCTGGCCTGAATTACTGCACCTCAGGCGCTTACTCCAACGCC 

AGCTCCACAGACTCCGCCTCCTACTATCCGCTGACCGGGGACACCAGATTAGAGCTGGAACTGAGGCCTA 

CAGGAGAGATTGAGCAGTATTCTGTCAGCGCAACCTATGAGCTCCAGAGAGAGGACAGAGCCTTGGTGGA 

TACCCTGAAGTTTGTAACTCAAGCAGAAGGTGCGAAGCAGACTGAGGCTACCATGACATTCAAATATAAT 

CGGCAGAGTATGACCTTGTCCAGTGAAGTCCAAATTCCGGATTTTGATGTTGACCTCGGAACAATCCTCA 

GAGTTAATGATGAATCTACTGAGGGCAAAACGTCTTACAGACTCACCCTGGACATTCAGAACAAGAAAAT 

TACTGAGGTCGCCCTCATGGGCCACCTAAGTTGTGACACAAAGGAAGAAAGAAAAATCAAGGGTGTTATT 

TCCATACCCCGTTTGCAAGCAGAAGCCAGAAGTGAGATCCTCGCCCACTGGTCGCCTGCCAAACTGCTTC 

TCCAAATGGACTCATCTGCTACAGCTTATGGCTCCACAGTTTCCAAGAGGGTGGCATGGCATTATGATGA 

AGAGAAGATTGAATTTGAATGGAACACAGGCACCAATGTAGATACCAAAAAAATGACTTCCAATTTCCCT 

GTGGATCTCTCCGATTATCCTAAGAGCTTGCATATGTATGCTAATAGACTCCTGGATCACAGAGTCCCTG 

AAACAGACATGACTTTCCGGCACGTGGGTTCCAAATTAATAGTTGCAATGAGCTCATGGCTTCAGAAGGC 

ATCTGGGAGTCTTCCTTATACCCAGACTTTGCAAGACCACCTCAATAGCCTGAAGGAGTTCAACCTCCAG 

AACATGGGATTGCCAGACTTCCACATCCCAGAAAACCTCTTCTTAAAAAGCGATGGCCGGGTCAAATATA 

CCTTGAACAAGAACAGTTTGAAAATTGAGATTCCTTTGCCTTTTGGTGGCAAATCCTCCAGAGATCTAAA 

GATGTTAGAGACTGTTAGGACACCAGCCCTCCACTTCAAGTCTGTGGGATTCCATCTGCCATCTCGAGAG 

TTCCAAGTCCCTACTTTTACCATTCCCAAGTTGTATCAACTGCAAGTGCCTCTCCTGGGTGTTCTAGACC 

TCTCCACGAATGTCTACAGCAACTTGTACAACTGGTCCGCCTCCTACAGTGGTGGCAACACCAGCACAGA 

CCATTTCAGCCTTCGGGCTCGTTACCACATGAAGGCTGACTCTGTGGTTGACCTGCTTTCCTACAATGTG 

CAAGGATCTGGAGAAACAACATATGACCACAAGAATACGTTCACACTATCATGTGATGGGTCTCTACGCC 

ACAAATTTCTAGATTCGAATATCAAATTCAGTCATGTAGAAAAACTTGGAAACAACCCAGTCTCAAAAGG 

TTTACTAATATTCGATGCATCTAGTTCCTGGGGACCACAGATGTCTGCTTCAGTTCATTTGGACTCCAAA 

AAGAAACAGCATTTGTTTGTCAAAGAAGTCAAGATTGATGGGCAGTTCAGAGTCTCTTCGTTCTATGCTA 

AAGGCACATATGGCCTGTCTTGTCAGAGGGATCCTAACACTGGCCGGCTCAATGGAGAGTCCAACCTGAG 

GTTTAACTCCTCCTACCTCCAAGGCACCAACCAGATAACAGGAAGATATGAAGATGGAACCCTCTCCCTC 

ACCTCCACCTCTGATCTGCAAAGTGGCATCATTAAAAATACTGCTTCCCTAAAGTATGAGAACTACGAGC 

TGACTTTAAAATCTGACACCAATGGGAAGTATAAGAACTTTGCCACTTCTAACAAGATGGATATGACCTT 

CTCTAAGCAAAATGCACTGCTGCGTTCTGAATATCAGGCTGATTACGAGTCATTGAGGTTCTTCAGCCTG 

CTTTCTGGATCACTAAATTCCCATGGTCTTGAGTTAAATGCTGACATCTTAGGCACTGACAAAATTAATA 

GTGGTGCTCACAAGGCGACACTAAGGATTGGCCAAGATGGAATATCTACCAGTGCAACGACCAACTTGAA 

GTGTAGTCTCCTGGTGCTGGAGAATGAGCTGAATGCAGAGCTTGGCCTCTCTGGGGCATCTATGAAATTA 

ACAACAAATGGCCGCTTCAGGGAACACAATGCAAAATTCAGTCTGGATGGGAAAGCCGCCCTCACAGAGC 

TATCACTGGGAAGTGCTTATCAGGCCATGATTCTGGGTGTCGACAGCAAAAACATTTTCAACTTCAAGGT 

CAGTCAAGAAGGACTTAAGCTCTCAAATGACATGATGGGCTCATATGCTGAAATGAAATTTGACCACACA 

AACAGTCTGAACATTGCAGGCTTATCACTGGACTTCTCTTCAAAACTTGACAACATTTACAGCTCTGACA 

AGTTTTATAAGCAAACTGTTAATTTACAGCTACAGCCCTATTCTCTGGTAACTACTTTAAACAGTGACCT 

GAAATACAATGCTCTGGATCTCACCAACAATGGGAAACTACGGCTAGAACCCCTGAAGCTGCATGTGGCT 

GGTAACCTAAAAGGAGCCTACCAAAATAATGAAATAAAACACATCTATGCCATCTCTTCTGCTGCCTTAT 

CAGCAAGCTATAAAGCAGACACTGTTGCTAAGGTTCAGGGTGTGGAGTTTAGCCATCGGCTCAACACAGA 

CATCGCTGGGCTGGCTTCAGCCATTGACATGAGCACAAACTATAATTCAGACTCACTGCATTTCAGCAAT 

GTCTTCCGTTCTGTAATGGCCCCGTTTACCATGACCATCGATGCACATACAAATGGCAATGGGAAACTCG 

CTCTCTGGGGAGAACATACTGGGCAGCTGTATAGCAAATTCCTGTTGAAAGCAGAACCTCTGGCATTTAC 

TTTCTCTCATGATTACAAAGGCTCCACAAGTCATCATCXCGTGTCTAGGAAAAGCATCAGTGCAGCTCTT 

GAACACAAAGTCAGTGCCCTGCTTACTCCAGCTGAGCAGACAGGCACCTGGAAACTCAAGACCCAATTTA 

ACAACAATGAATACAGCCAGGACTTGGATGCTTACAACACTAAAGATAAAATTGGCGTGGAGCTTACTGG 

ACGAACTCTGGCTGACCTAACTCTACTAGACTCCCCAATTAAAGTGCCACTTTTACTCAGTGAGCCCATC 

AATATCATTGATGCTTTAGAGATGAGAGATGCCGTTGAGAAGCCCCAAGAATTTACAATTGTTGCTTTTG 

TAAAGTATGATAAAAACCAAGATGTTCACTCCATTAACCTCCCATTTTTTGAGACCTTGCAAGAATATTT 

TGAGAGGAATCGACAAACCATTATAGTTGTAGTGGAAAACGTACAGAGAAACCTGAAGCACATCAATATT 

GATCAATTTGTAAGAAAATACAGAGCAGCCCTGGGAAAACTCCCACAGCAAGCTAATGATTATCTGAATT 

CATTCAATTGGGAGAGACAAGTTTCACATGCCAAGGAGAAACTGACTGCTCTCACAAAAAAGTATAGAAT 

TACAGAAAATGATATACAAATTGCATTAGATGATGCCAAAATCAACTTTAATGAAAAACTATCTCAACTG 
CAGACATATATGATACAATTTGATCAGTATATTAAAGATAGTTATGATTTACATGATTTGAAAATAGCTA 
TTGCTAATATTATTGATGAAATCATTGAAAAATTAAAAAGTCTTGATGAGCACTATCATATCCGTGTAAA 
TTTAGTAAAAACAATCCATGATCTACATTTGTTTATTGAAAATATTGATTTTAACAAAAGTGGAAGTAGT 
ACTGCATCCTGGATTCAAAATGTGGATACTAAGTACCAAATCAGAATCCAGATACAAGAAAAACTGCAGC 
AGCTTAAGAGACACATACAGAATATAGACATCCAGCACCTAGCTGGAAAGTTAAAACAACACATTGAGGC 
TATTGATGTTAGAGTGCTTTTAGATCAATTGGGAACTACAATTTCATTTGAAAGAATAAATGATGTTCTT 
GAGCATGTCAAACACTTTGTTATAAATCTTATTGGGGATTTTGAAGTAGCTGAGAAAATCAATGCCTTCA 
GAGCCAAAGTCCATGAGTTAATCGAGAGGTATGAAGTAGACCAACAAATCCAGGTTTTAATGGATAAATT 
AGTAGAGTTGACCCACCAATACAAGTTGAAGGAGACTATTCAGAAGCTAAGCAATGTCCTACAACAAGTT 

AAGATAAAAGATTACTTTGAGAAATTGGTTGGATTTATTGATGATGCTGTGAAGAAGCTTAATGAATTAT 
CTTTTAAAACATTCATTGAAGATGTTAACAAATTCCTTGACATGTTGATAAAGAAATTAAAGTCATTTGA 
TTACCACCAGTTTGTAGATGAAACCAATGACAAAATCCGTGAGGTGACTCAGAGACTCAATGGTGAAATT 
CAGGCTCTGGAACTACCACAAAAAGCTGAAGCATTAAAACTGTTTTTAGAGGAAACCAAGGCCACAGTTG 
CAGTGTATCTGGAAAGCCTACAGGACACCAAAATAACCTTAATCATCAATTGGTTACAGGAGGCTTTAAG 

TTCAGCATCTTTGGCTCACATGAAGGCCAAATTCCGAGAGACTCTAGAAGATACACGAGACCGAATGTAT 
CAAATGGACATTCAGCAGGAACTTCAACGATACCTGTCTCTGGTAGGCCAGGTTTATAGCACACTTGTCA 
CCTACATTTCTGATTGGTGGACTCTTGCTGCTAAGAACCTTACTGACTTTGCAGAGCAATATTCTATCCA 
AGATTGGGCTAAACGTATGAAAGCATTGGTAGAGCAAGGGTTCACTGTTCCTGTU^TCAAGACCATCCTT 
GGGACCATGCCTGCCTTTGAAGTCAGTCTTCAGGCTCTTCAGAAAGCTACCTTCCAGACACCTGATTTTA 

TAGTCCCCCTAACAGATTTGAGGATTCCATCAGTTCAGATAAACTTCAAAGACTTAAAAAATATAAAAAT 
CCCATCCAGGTTTTCCACACCAGAATTTACCATCCTTAACACCTTCCACATTCCTTCCTTTACAATTGAC 
TTTGTCGAAATGAAAGTAAAGATCATCAGAACCATTGACCAGATGCAGAACAGTGAGCTGCAGTGGCCCG 
TTCCAGATATATATCTCAGGGATCTGAAGGTGGAGGACATTCCTCTAGCGAGAATCACCCTGCCAGACTT 
CCGTTTACCAGAAATCGCAATTCCAGAATTCATAATCCCAACTCTCAACCTTAATGATTTTCAAGTTCCT 

GACCTTCACATACCAGAATTCCAGCTTCCCCACATCTCACACACAATTGAAGTACCTACTTTTGGCAAGC 
TATACAGTATTCTGAAAATCCAATCTCCTCTTTTCACATTAGATGCAAATGCTGACATAGGGAATGGAAC 
CACCTCAGCAAACGAAGCAGGTATCGCAGCTTCCATCACTGCCAAAGGAGAGTCCAAATTAGAAGTTCTC 
AATTTTGATTTTCAAGCAAATGCACAACTCTCAAACCCTAAGATTAATCCGCTGGCTCTGAAGGAGTCAG 
TGAAGTTCTCCAGCAAGTACCTGAGAACGGAGCATGGGAGTGAAATGCTGTTTTTTGGAAATGCTATTGA 

GGGAAAATCAAACACAGTGGCAAGTTTACACACAGAAAAAAATACACTGGAGCTTAGTAATGGAGTGATT 
GTCAAGATAAACAATCAGCTTACCCTGGATAGCAACACTAAATACTTCCACAAATTGAACATCCCCAAAC 
TGGACTTCTCTAGTCAGGCTGACCTGCGCAACGAGATCAAGACACTGTTGAAAGCTGGCCACATAGCATG 
GACTTCTTCTGGAAAAGGGTCATGGAAATGGGCCTGCCCCAGATTCTCAGATGAGGGAACACATGAATCA 
CAAATTAGTTTCACCATAGAAGGACCCCTCACTTCCTTTGGACTGTCCAATAAGATCAATAGCAAACACC 

TAAGAGTAAACCAAAACTTGGTTTATGAATCTGGCTCCCTCAACTTTTCTAAACTTGAAATTCAATCACA 
AGTCGATTCCCAGCATGTGGGCCACAGTGTTCTAACTGCTAAAGGCATGGCACTGTTTGGAGAAGGGAAG 
GCAGAGTTTACTGGGAGGCATGATGCTCATTTAAATGGAAAGGTTATTGGAACTTTGAAAAATTCTCTTT 
TCTTTTCAGCCCAGCCATTTGAGATCACGGCATCCACAAACAATGAAGGGAATTTGAAAGTTCGTTTTCC 
ATTAAGGTTAACAGGGAAGATAGACTTCCTGAATAACTATGCACTGTTTCTGAGTCCCAGTGCCCAGCAA 

GCAAGTTGGCAAGTAAGTGCTAGGTTCAATCAGTATAAGTACAACCAAAATTTCTCTGCTGGAAACAACG 
AGAACATTATGGAGGCCCATGTAGGAATAAATGGAGAAGCAAATCTGGATTTCTTAAACATTCCTTTAAC 
AATTCCTGAAATGCGTCTACCTTACACAATAATCACAACTCCTCCACTGAAAGATTTCTCTCTATGGGAA 
AAAACAGGCTTGAAGGAATTCTTGAAAACGACAAAGCAATCATTTGATTTAAGTGTAAAAGCTCAGTATA 
AGAAAAACAAACACAGGCATTCCATCACAAATCCTTTGGCTGTGCTTTGTGAGTTTATCAGTCAGAGCAT 

CAAATCCTTTGACAGGCATTTTGAAAAAAACAGAAACAATGCATTAGATTTTGTCACCAAATCCTATAAT 
GAAACAAAAATTAAGTTTGATAAGTACAAAGCTGAAAAATCTCACGACGAGCTCCCCAGGACCTTTCAAA 
TTCCTGGATACACTGTTCCAGTTGTCAATGTTGAAGTGTCTCCATTCACCATAGAGATGTCGGCATTCGG 
CTATGTGTTCCCAAAAGCAGTCAGCATGCCTAGTTTCTCCATCCTAGGTTCTGACGTCCGTGTGCCTTCA 
TACACATTAATCCTGCCATCATTAGAGCTGCCAGTCCTTCATGTCCCTAGAAATCTCAAGCTTTCTCTTC 

CACATTTCAAGGAATTGTGTACCATAAGCCATATTTTTATTCCTGCCATGGGCAATATTACCTATGATTT 
CTCCTTTAAATCAAGTGTCATCACACTGAATACCAATGCTGAACTTTTTAACCAGTCAGATATTGTTGCT 
CATCTCCTTTCTTCATCTTCATCTGTCATTGATGCACTGCAGTACAAATTAGAGGGCACCACAAGATTGA 
CAAGAAAAAGGGGATTGAAGTTAGCCACAGCTCTGTCTCTGAGCAACAAATTTGTGGAGGGTAGTCATAA 
CAGTACTGTGAGCTTAACCACGAAAAATATGGAAGTGTCAGTGGCAAAAACCACAAAAGCCGAAATTCCA 

ATTTTGAGAATGAATTTCAAGCAAGAACTTAATGGAAATACCAAGTCAAAACCTACTGTCTCTTCCTCCA 
TGGAATTTAAGTATGATTTCAATTCTTCAATGCTGTACTCTACCGCTAAAGGAGCAGTTGACCACAAGCT 
TAGCTTGGAAAGCCTCACCTCTTACTTTTCCATTGAGTCATCTACCAAAGGAGATGTCAAGGGTTCGGTT 
CTTTCTCGGGAATATTCAGGAACTATTGCTAGTGAGGCCAACACTTACTTGAATTCCAAGAGCACACGGT 
CTTCAGTGAAGCTGCAGGGCACTTCCAAAATTGATGATATCTGGAACCTTGAAGTAAAAGAAAATTTTGC 
TGGAGAAGCCACACTCCAACGCATATATTCCCTCTGGGAGCACAGTACGAAAAACCACTTACAGCTAGAG 
GGCCTCTTTTTCACCAACGGAGAACATACAAGCAAAGCCACCCTGGAACTCTCTCCATGGCAAATGTCAG 
CTCTTGTTCAGGTCCATGCAAGTCAGCCCAGTTCCTTCCATGATTTCCCTGACCTTGGCCAGGAAGTGGC 
CCTGAATGCTAACACTAAGAACCAGAAGATCAGATGGAAAAATGAAGTCCGGATTCATTCTGGGTCTTTC 
CAGAGCCAGGTCGAGCTTTCCAATGACCAAGAAAAGGCACACCTTGACATTGCAGGATCCTTAGAAGGAC 
ACCTAAGGTTCCTCAAAAATATCATCCTACCAGTCTATGACAAGAGCTTATGGGATTTCCTAAAGCTGGA 
TGTAACCACCAGCATTGGTAGGAGACAGCATCTTCGTGTTTCAACTGCCTTTGTGTACACCAAAAACCCC 

AATGGCTATTCATTCTCCATCCCTGTAAAAGTTTTGGCTGATAAATTCATTACTCCTGGGCTGAAACTAA 
ATGATCTAAATTCAGTTCTTGTCATGCCTACGTTCCATGTCCCATTTACAGATCTTCAGGTTCCATCGTG 
CAAACTTGACTTCAGAGAAATACAAATCTATAAGAAGCTGAGAACTTCATCATTTGCCCTCAACCTACCA 
ACACTCCCCGAGGTAAAATTCCCTGAAGTTGATGTGTTAACAAAATATTCTCAACCAGAAGACTCCTTGA 
TTCCCTTTTTTGAGATAACCGTGCCTGAATCTCAGTTAACTGTGTCCCAGTTCACGCTTCCAAAAAGTGT 

TTCAGATGGCATTGCTGCTTTGGATCTAAATGCAGTAGCCAACAAGATCGCAGACTTTGAGTTGCCCACC 
ATCATCGTGCCTGAGCAGACCATTGAGATTCCCTCCATTAAGTTCTCTGTACCTGCTGGAATTGTCATTC 
CTTCCTTTCAAGCACTGACTGCACGCTTTGAGGTAGACTCTCCCGTGTATAATGCCACTTGGAGTGCCAG 
TTTGAAAAACAAAGCAGATTATGTTGAAACAGTCCTGGATTCCACATGCAGCTCAACCGTACAGTTCCTA 
GAATATGAACTAAATGTTTTGGGAACACACAAAATCGAAGATGGTACGTTAGCCTCTAAGACTAAAGGAA 

CACTTGCACACCGTGACTTCAGTGCAGAATATGAAGAAGATGGCAAATTTGAAGGACTTCAGGAATGGGA 
AGGAAAAGCGCACCTCAATATCAAAAGCCCAGCGTTCACCGATCTCCATCTGCGCTACCAGAAAGACAAG 
AAAGGCATCTCCACCTCAGCAGCCTCCCCAGCCGTAGGCACCGTGGGCATGGATATGGATGAAGATGACG 
ACTTTTCTAAATGGAACTTCTACTACAGCCCTCAGTCCTCTCCAGATAAAAAACTCACCATATTCAAAAC 
TGAGTTGAGGGTCCGGGAATCTGATGAGGAAACTCAGATCAAAGTTAATTGGGAAGAAGAGGCAGCTTCT 

GGCTTGCTAACCTCTCTGAAAGACAACGTGCCCAAGGCCACAGGGGTCCTTTATGATTATGTCAACAAGT 
ACCACTGGGAACACACAGGGCTCACCCTGAGAGAAGTGTCTTCAAAGCTGAGAAGAAATCTGCAGAACAA 
TGCTGAGTGGGTTTATCAAGGGGCCATTAGGCAAATTGATGATATCGACGTGAGGTTCCAGAAAGCAGCC 
AGTGGCACCACTGGGACCTACCAAGAGTGGAAGGACAAGGCCCAGAATCTGTACCAGGAACTGTTGACTC 
AGGAAGGCCAAGCCAGTTTCCAGGGACTCAAGGATAACGTGTTTGATGGCTTGGTACGAGTTACTCAAAA 

ATTCCATATGAAAGTCAAGCATCTGATTGACTCACTCATTGATTTTCTGAACTTCCCCAGATTCCAGTTT 
CCGGGGAAACCTGGGATATACACTAGGGAGGAACTTTGCACTATGTTCATAAGGGAGGTAGGGACGGTAC 
TGTCCCAGGTATATTCGAAAGTCCATAATGGTTCAGAAATACTGTTTTCCTATTTCCAAGACCTAGTGAT 
TACACTTCCTTTCGAGTTAAGGAAACATAAACTAATAGATGTAATCTCGATGTATAGGGAACTGTTGAAA 
GATTTATCAAAAGAAGCCCAAGAGGTATTTAAAGCCATTCAGTCTCTCAAGACCACAGAGGTGCTACGTA 

ATCTTCAGGACCTTTTACAATTCATTTTCCAACTAATAGAAGATAACATTAAACAGCTGAAAGAGATGAA 
ATTTACTTATCTTATTAATTATATCCAAGATGAGATCAACACAATCTTCAATGATTATATCCCATATGTT 
TTTAAATTGTTGAAAGAAAACCTATGCCTTAATCTTCATAAGTTCAATGAATTTATTCAAAACGAGCTTC 
AGGAAGCTTCTCAAGAGTTACAGCAGATCCATCAATACATTATGGCCCTTCGTGAAGAATATTTTGATCC 
AAGTATAGTTGGCTGGACAGTGAAATATTATGAACTTGAAGAAAAGATAGTCAGTCTGATCAAGAACCTG 

TTAGTTGCTCTTAAGGACTTCCATTCTGAATATATTGTCAGTGCCTCTAACTTTACTTCCCAACTCTCAA 
GTCAAGTTGAGCAATTTCTGCACAGAAATATTCAGGAATATCTTAGCATCCTTACCGATCCAGATGGAAA 
AGGGAAAGAGAAGATTGCAGAGCTTTCTGCCACTGCTCAGGAAATAATTAAAAGCCAGGCCATTGCGACG 
AAGAAAATAATTTCTGATTACCACCAGCAGTTTAGATATAAACTGCAAGATTTTTCAGACCAACTCTCTG 
ATTACTATGAAAAATTTATTGCTGAATCCAAAAGATTGATTGACCTGTCCATTCAAAACTACCACACATT 

TCTGATATACATCACGGAGTTACTGAAAAAGCTGCAATCAACCACAGTCATGAACCCCTACATGAAGCTT 
GCTCCAGGAGAACTTACTATCATCCTCTAATTTTTTAAAAGAAATCTTCATTTATTCTTCTTTTCCAATT 
GAACTTTCACATAGCACAGAAAAAATTCAAACTGCCTATATTGATAAAACCATACAGTGAGCCAGCCTTG 
CAGTAGGCAGTAGACTATAAGCAGAAGCACATATGAACTGGACCTGCACCAAAGCTGGCACCAGGGCTCG 
GAAGGTCTCTGAACTCAGAAGGATGGCATTTTTTGCAAGTTAAAGAAAATCAGGATCTGAGTTATTTTGC 

TAAACTTGGGGGAGGAGGAACAAATAAATGGAGTCTTTATTGTGTATCATA (SEQ ID NO: 6681) 



>gi|4557442|ref|NM_000078.1| Homo sapiens cholesteryl ester transfer 
protein, plasma (CETP), mRNA 

GTGAATCTCTGGGGCCAGGAAGACCCTGCTGCCCGGAAGAGCCTCATGTTCCGTGGGGGCTGGGCGGACA 
TACATATACGGGCTCCAGGCTGAACGGCTCGGGCCACTTACACACCACTGCCTGATAACCATGCTGGCTG 
CCACAGTCCTGACCCTGGCCCTGCTGGGCAATGCCCATGCCTGCTCCAAAGGCACCTCGCACGAGGCAGG 
CATCGTGTGCCGCATCACCAAGCCTGCCCTCCTGGTGTTGAACCACGAGACTGCCAAGGTGATCCAGACC 
GCCTTCCAGCGAGCCAGCTACCCAGATATCACGGGCGAGAAGGCCATGATGCTCCTTGGCCAAGTCAAGT 
ATGGGTTGCACAACATCCAGATCAGCCACTTGTCCATCGCCAGCAGCCAGGTGGAGCTGGTGGAAGCCAA 
GTCCATTGATGTCTCCATTCAGAACGTGTCTGTGGTCTTCAAGGGGACCCTGAAGTATGGCTACACCACT 
GCCTGGTGGCTGGGTATTGATCAGTCCATTGACTTCGAGATCGACTCTGCCATTGACCTCCAGATCAACA 
CACAGCTGACCTGTGACTCTGGTAGAGTGCGGACCGATGCCCCTGACTGCTACCTGTCTTTCCATAAGCT 
GCTCCTGCATCTCCAAGGGGAGCGAGAGCCTGGGTGGATCAAGCAGCTGTTCACAAATTTCATCTCCTTC 
ACCCTGAAGCTGGTCCTGAAGGGACAGATCTGCAAAGAGATCAACGTCATCTCTAACATCATGGCCGATT 
TTGTCCAGACAAGGGCTGCCAGCATCCTTTCAGATGGAGACATTGGGGTGGACATTTCCCTGACAGGTGA 

TCCCGTCATCACAGCCTCCTACCTGGAGTCCCATCACAAGGGTCATTTCATCTACAAGAATGTCTCAGAG 
GACCTCCCCCTCCCCACCTTCTCGCCCACACTGCTGGGGGACTCCCGCATGCTGTACTTCTGGTTCTCTG 
AGCGAGTCTTCCACTCGCTGGCCAAGGTAGCTTTCCAGGATGGCCGCCTCATGCTCAGCCTGATGGGAGA 
CGAGTTCAAGGCAGTGCTGGAGACCTGGGGCTTCAACACCAACCAGGAAATCTTCCAAGAGGTTGTCGGC 
GGCTTCCCCAGCCAGGCCCAAGTCACCGTCCACTGCCTCAAGATGCCCAAGATCTCCTGCCAAAACAAGG 

GAGTCGTGGTCAATTCTTCAGTGATGGTGAAATTCCTCTTTCCACGCCCAGACCAGCAACATTCTGTAGC 
TTACACATTTGAAGAGGATATCGTGACTACCGTCCAGGCCTCCTATTCTAAGAAAAAGCTCTTCTTAAGC 
CTCTTGGATTTCCAGATTACACCAAAGACTGTTTCCAACTTGACTGAGAGCAGCTCCGAGTCCATCCAGA 
GCTTCCTGCAGTCAATGATCACCGCTGTGGGCATCCCTGAGGTCATGTCTCGGCTCGAGGTAGTGTTTAC 
AGCCCTCATGAACAGCAAAGGCGTGAGCCTCTTCGACATCATCAACCCTGAGATTATCACTCGAGATGGC 

TTCCTGCTGCTGCAGATGGACTTTGGCTTCCCTGAGCACCTGCTGGTGGATTTCCTCCAGAGCTTGAGCT 
AGAAGTCTCCAAGGAGGTCGGGATGGGGCTTGTAGCAGAAGGCAAGCACCAGGCTCACAGCTGGAACCCT 
GGTGTCTCCTCCAGCGTGGTGGAAGTTGGGTTAGGAGTACGGAGATGGAGATTGGCTCCCAACTCCTCCC 
TATCCTAAAGGCCCACTGGCATTAAAGTGCTGTATCCAAG (SEQ ID NO: 6682) 




>gi|414668|emb|X75500.1|HSMTP H. sapiens mRNA for microsomal triglyceride 
transfer protein 

TGCAGTTGAGGATTGCTGGTCAATATGATTCTTCTTGCTGTGCTTTTTCTCTGCTTCATTTCCTCATATT 
CAGCTTCTGTTAAAGGTCACACAACTGGTCTCTCATTAAATAATGACCGGCTGTACAAGCTCACGTACTC 

CACTGAAGTTCTTCTTGATCGGGGCAAAGGAAAACTGCAAGACAGCGTGGGCTACCGCATTTCCTCCAAC 
GTGGATGTGGCCTTACTATGGAGGAATCCTGATGGTGATGATGACCAGTTGATCCAAATAACGATGAAGG 
ATGTAAATGTTGAAAATGTGAATCAGCAGAGAGGAGAGAAGAGCATCTTCAAAGGAAAAAGCCCATCTAA 
AATAATGGGAAAGGAAAACTTGGAAGCTCTGCAAAGACCTACGCTCCTTCATCTAATCCATGGAAAGGTC 
AAAGAGTTCTACTCATATCAAAATGAGGCAGTGGCCATAGAAAATATCAAGAGAGGTCTGGCTAGCCTAT 

TTCAGACACAGTTAAGCTCTGGAACCACCAATGAGGTAGATATCTCTGGAAATTGTAAAGTGACCTACCA 
GGCTCATCAAGACAAAGTGATCAAAATTAAGGCCTTGGATTCATGCAAAATAGCGAGGTCTGGATTTACG 
ACCCCAAATCAGGTCTTGGGTGTCAGTTCAAAAGCTACATCTGTCACCACCTATAAGATAGAAGACAGCT 
TTGTTATAGCTGTGCTTGCTGAAGAAACACACAATTTTGGACTGAATTTCCTACAAACCATTAAGGGGAA 
AATAGTATCGAAGCAGAAATTAGAGCTGAAGACAACCGAAGCAGGCCCAAGATTGATGTCTGGAAAGCAG 

GCTGCAGCCATAATCAAAGCAGTTGATTCAAAGTACACGGCCATTCCCATTGTGGGGCAGGTCTTCCAGA 
GCCACTGTAAAGGATGTCCTTCTCTCTCGGAGCTCTGGCGGTCCACCAGGAAATACCTGCAGCCTGACAA 
CCTTTCCAAGGCTGAGGCTGTCAGAAACTTCCTGGCCTTCATTCAGCACCTCAGGACTGCGAAGAAAGAA 
GAGATCCTTCAAATACTAAAGATGGAAAATAAGGAAGTATTACCTCAGCTGGTGGATGCTGTCACCTCTG 
CTCAGACCTCAGACTCATTAGAAGCCATTTTGGACTTTTTGGATTTCAAAAGTGACAGCAGCATTATCCT 

CCAGGAGAGGTTTCTCTATGCCTGTGGATTTGCTTCTCATCCCAATGAAGAACTCCTGAGAGCCCTCATT 
AGTAAGTTCAAAGGTTCTATTGGTAGCAGTGACATCAGAGAAACTGTTATGATCATCACTGGGACACTTG 
TCAGAAAGTTGTGTCAGAATGAAGGCTGCAAACTCAAAGCAGTAGTGGAAGCTAAGAAGTTAATCCTGGG 
AGGACTTGAAAAAGCAGAGAAAAAAGAGGACACCAGGATGTATCTGCTGGCTTTGAAGAATGCCCTGCTT 
CCAGAAGGCATCCCAAGTCTTCTGAAGTATGCAGAAGCAGGAGAAGGGCCCATCAGCCACCTGGCTACCA 

CTGCTCTCCAGAGATATGATCTCCCTTTCATAACTGATGAGGTGAAGAAGACCTTAAACAGAATATACCA 
CCAAAACCGTAAAGTTCATGAAAAGACTGTGCGCACTGCTGCAGCTGCTATCATTTTAAATAACAATCCA 
TCCTACATGGACGTCAAGAACATCCTGCTGTCTATTGGGGAGCTTCCCCAAGAAATGAATAAATACATGC 
TCGCCATTGTTCAAGACATCCTACGTTTTGAAATGCCTGCAAGCAAAATTGTCCGTCGAGTTCTGAAGGA 
AATGGTCGCTCACAATTATGACCGTTTCTCCAGGAGTGGATCTTCTTCTGCCTACACTGGCTACATAGAA 

CGTAGTCCCCGTTCGGCATCTACTTACAGCCTAGACATTCTCTACTCGGGTTCTGGCATTCTAAGGAGAA 
GTAACCTGAACATCTTTCAGTACATTGGGAAGGCTGGTCTTCACGGTAGCCAGGTGGTTATTGAAGCCCA 
AGGACTGGAAGCCTTAATCGCAGCCACCCCTGACGAGGGGGAGGAGAACCTTGACTCCTATGCTGGTATG 
TCAGCCATCCTCTTTGATGTTCAGCTCAGACCTGTCACCTTTTTCAACGGATACAGTGATTTGATGTCCA 
AAATGCTGTCAGCATCTGGCGACCCTATCAGTGTGGTGAAAGGACTTATTCTGCTAATAGATCATTCTCA 
GGAACTTCAGTTACAATCTGGACTAAAAGCCAATATAGAGGTCCAGGGTGGTCTAGCTATTGATATTTCA 
GGTGCAATGGAGTTTAGCTTGTGGTATCGTGAGTCTAAAACCCGAGTGAAAAATAGGGTGACTGTGGTAA 
TAACCACTGACATCACAGTGGACTCCTCTTTTGTGAAAGCTGGCCTGGAAACCAGTACAGAAACAGAAGC 
AGGCTTGGAGTTTATCTCCACAGTGCAGTTTTCTCAGTACCCATTCTTAGTTTGCATGCAGATGGACAAG 
GATGAAGCTCCATTCAGGCAATTTGAGAAAAAGTACGAAAGGCTGTCCACAGGCAGAGGTTATGTCTCTC 

AGAAAAGAAAAGAAAGCGTATTAGCAGGATGTGAATTCCCGCTCCATCAAGAGAACTCAGAGATGTGCAA 
AGTGGTGTTTGCCCCTCAGCCGGATAGTACTTCCAGCGGATGGTTTTGAAACTGACCTGTGATATTTTAC 
TTGAATTTGTCTCCCCGAAAGGGACACAATGTGGCATGACTAAGTACTTGCTCTCTGAGAGCACAGCGTT 
TACATATTTACCTGTATTTAAGATTTTTGTAAAAAGCTACAAAAAACTGCAGTTTGATCAAATTTGGGTA 
TATGCAGTATGCTACCCACAGCGTCATTTTGAATCATCATGTGACGCTTTCAACAACGTTCTTAGTTTAC 

TTATACCTCTCTCAAATCTCATTTGGTACAGTCAGAATAGTTATTCTCTAAGAGGAAACTAGTGTTTGTT 
AAAAACAAAAATAAAAACAAAACCACACAAGGAGAACCCAATTTTGTTTCAACAATTTTTGATCAATGTA 
TATGAAGCTCTTGATAGGACTTCCTTAAGCATGACGGGAAAACCAAACACGTTCCCTAATCAGGAAAAAA 
JVAAAAAAAAAAAAAGTAAGACACAAACAAACCATTTTTTTCTCTTTTTTTGGAGTTGGGGGCCCAGGGAG 
AAGGGACAAGGCTTTTAAAAGACTTGTTAGCCAACTTCAAGAATTAATATTTATGTCTCTGTTATTGTTA 

GTTTTAAGCCTTAAGGTAGAAGGCACATAGAAATAACATC (SEQ ID NO: 6683) 



>gi|1217638|emb|X91148.1|HSMTTP H. sapiens mRNA for microsomal triglyceride 
transfer protein 

TGCAGTTGAGGATTGCTGGTCAATATGATTCTTCTTGCTGTGCTTTTTCTCTGCTTCATTTCCTCATATT 

CAGCTTCTGTTAAAGGTCACACAACTGGTCTCTCATTAAATAATGACCGGCTGTACAAGCTCACGTACTC 
CACTGAAGTTCTTCTTGATCGGGGCAAAGGAAAACTGCAAGACAGCGTGGGCTACCGCATTTCCTCCAAC 
GTGGATGTGGCCTTACTATGGAGGAATCCTGATGGTGATGATGACCAGTTGATCCAAATAACGATGAAGG 
ATGTAAATGTTGAAAATGTGAATCAGCAGAGAGGAGAGAAGAGCATCTTCAAAGGAAAAAGCCCATCTAA 
AATAATGGGAAAGGAAAACTTGGAAGCTCTGCAAAGACCTACGCTCCTTCATCTAATCCATGGAAAGGTC 

AAAGAGTTCTACTCATATCAAAATGAGGCAGTGGCCATAGAAAATATCAAGAGAGGTCTGGCTAGCCTAT 
TTCAGACACAGTTAAGCTCTGGAACCACCAATGAGGTAGATATCTCTGGAAATTGTAAAGTGACCTACCA 
GGCTCATCAAGACAAAGTGATCAAAATTAAGGCCTTGGATTCATGCAAAATAGCGAGGTCTGGATTTACG 
ACCCCAAATCAGGTCTTGGGTGTCAGTTCAAAAGCTACATCTGTCACCACCTATAAGATAGAAGACAGCT 
TTGTTATAGCTGTGCTTGCTGAAGAAACACACAATTTTGGACTGAATTTCCTACAAACCATTAAGGGGAA 

AATAGTATCGAAGCAGAAATTAGAGCTGAAGACAACCGAAGCAGGCCCAAGATTGATGTCTGGAAAGCAG 
GCTGCAGCCATAATCAAAGCAGTTGATTCAAAGTACACGGCCATTCCCATTGTGGGGCAGGTCTTCCAGA 
GCCACTGTAAAGGATGTCCTTCTCTCTCGGAGCTCTGGCGGTCCACCAGGAAATACCTGCAGCCTGACAA 
CCTTTCCAAGGCTGAGGCTGTCAGAAACTTCCTGGCCTTCATTCAGCACCTCAGGACTGCGAAGAAAGAA 
GAGATCCTTCAAATACTAAAGATGGAAAATAAGGAAGTATTACCTCAGCTGGTGGATGCTGTCACCTCTG 

CTCAGACCTCAGACTCATTAGAAGCCATTTTGGACTTTTTGGATTTCAAAAGTGACAGCAGCATTATCCT 
CCAGGAGAGGTTTCTCTATGCCTGTGGATTTGCTTCTCATCCCAATGAAGAACTCCTGAGAGCCCTCATT 
AGTAAGTTCAAAGGTTCTATTGGTAGCAGTGACATCAGAGAAACTGTTATGATCATCACTGGGACACTTG 
TCAGAAAGTTGTGTCAGAATGAAGGCTGCAAACTCAAAGCAGTAGTGGAAGCTAAGAAGTTAATCCTGGG 
AGGACTTGAAAAAGCAGAGAAAAAAGAGGACACCAGGATGTATCTGCTGGCTTTGAAGAATGCCCTGCTT 

CCAGAAGGCATCCCAAGTCTTCTGAAGTATGCAGAAGCAGGAGAAGGGCCCATCAGCCACCTGGCTACCA 
CTGCTCTCCAGAGATATGATGCTCCCTTTCATAACTGATGAGGTGAAGAAGACCTTAAACAGAATATACC 
ACCAAAACCGTAAAGTTCATGAAAAGACTGTGCGCACTGCTGCAGCTGCTATCATTTTAAATAACAATCC 
ATCCTACATGGACGTCAAGAACATCCTGCTGTCTATTGGGGAGCTTCCCCAAGAAATGAATAAATACATG 
CTCGCCATTGTTCAAGACATCCTACGTTTTGAAATGCCTGCAAGCAAAATTGTCCGTCGAGTTCTGAAGG 

AAATGGTCGCTCACAATTATGACCGTTTCTCCAGGAGTGGATCTTCTTCTGCCTACACTGGCTACATAGA 
ACGTAGTCCCCGTTCGGCATCTACTTACAGCCTAGACATTCTCTACTCGGGTTCTGGCATTCTAAGGAGA 
AGTAACCTGAACATCTTTCAGTACATTGGGAAGGCTGGTCTTCACGGTAGCCAGGTGGTTATTGAAGCCC 
AAGGACTGGAAGCCTTAATCGCAGCCACCCCTGACGAGGGGGAGGAGAACCTTGACTCCTATGCTGGTAT 
GTCAGCCATCCTCTTTGATGTTCAGCTCAGACCTGTCACCTTTTTCAACGGATACAGTGATTTGATGTCC 

AAAATGCTGTCAGCATCTGGCGACCCTATCAGTGTGGTGAAAGGACTTATTCTGCTAATAGATCATTCTC 
AGGAACTTCAGTTACAATCTGGACTAAAAGCCAATATAGAGGTCCAGGGTGGTCTAGCTATTGATATTTC 
AGGTGCAATGGAGTTTAGCTTGTGGTATCGTGAGTCTAAAACCCGAGTGAAAAATAGGGTGACTGTGGTA 
ATAACCACTGACATCACAGTGGACTCCTCTTTTGTGAAAGCTGGCCTGGAAACCAGTACAGAAACAGAAG 
CAGGCTTGGAGTTTATCTCCACAGTGCAGTTTTCTCAGTACCCATTCTTAGTTTGCATGCAGATGGACAA 
GGATGAAGCTCCATTCAGGCAATTTGAGAAAAAGTACGAAAGGCTGTCCACAGGCAGAGGTTATGTCTCT 
CAGAAAAGAAAAGAAAGCGTATTAGCAGGATGTGAATTCCCGCTCCATCAAGAGAACTCAGAGATGTGCA 
AAGTGGTGTTTGCCCCTCAGCCGGATAGTACTTCCAGCGGATGGTTTTGAAACTGACCTGTGATATTTTA 
CTTGAATTTGTCTCCCCGAAAGGGACACAATGTGGCATGACTAAGTACTTGCTCTCTGAGAGCACAGCGT 
TTACATATTTACCTGTATTTAAGATTTTTGTAAAAAGCTACAAAAAACTGCAGTTTGATCAAATTTGGGT 
ATATGCAGTATGCTACCCACAGCGTCATTTTGAATCATCATGTGACGCTTTCAACAACGTTCTTAGTTTA 

CTTATACCTCTCTCAAATCTCATTTGGTACAGTCAGAATAGTTATTCTCTAAGAGGAAACTAGTGTTTGT 
TAAAAACAAAAATAAAAACAAAACCACACAAGGAGAACCCAATTTTGTTTCAACAATTTTTGATCAATGT 
ATATGAAGCTCTTGATAGGACTTCCTTAAGCATGACGGGAAAACCAAACACGTTCCCTAATCAGGAAAAA 
AAAAAAAAAAGAAAAAGTAAGACACAAACAAACCATTTTTTTCTCTTTTTTTGGAGTTGGGGGCCCAGGG 
AGAAGGGACAAGGCTTTTAAAAGACTTGTTAGCCAACTTCAAGAATTAATATTTATGTCTCTGTTATTGT 

TAGTTTTAAGCCTTAAGGTAGAAGGCACATAGAAATAACATCTCATCTTTCTGCTGACCATTTTAGTGAG 
GXTGTTCCAAAGAGCATTCAGGTCTCTACCTCCAGCCCTGCAAAAATATTGGACCTAGCACAGAGGAATC 
AGGAAAATTAATTTCAGAAACTCCATTTGATTTTTCTTTTGCTGTGTCTTTTTTGAGACTGTAATATGGT 
ACACTGTCCTCTAAGGACATCCTCATTTTATCTCACCTTTTTGGGGGTGAGAGCTCTAGTTCATTTAACT 
GTACTCTGCACAATAGCTAGGATGACTAAGAGAACATTGCTTCAAGAAACTGGTGGATTTGGATTTCCAA 

AATATGAAATAAGGAGAAAAATGTTTTTATTTGTATGAATTAAAAGATCCATGTTGAACATTTGCAAATA 
TTTATTAATAAACAGATGTGGTGATAAACCCAAAACAAATGACAGGTGCTTATTTTCCACTAAACACAGA 
CACATGAAATGAAAGTTTAGCTAGCCCACTATTTGTTGTAAATTGAAAACGAAGTGTGATAAAATAAATA 
TGTAGAAATCAAAAAAAAAAAAAAAAAAAA (SEQ ID NO: 6684) 




>gi|21361125|ref|NM_001467.2| Homo sapiens glucose-6-phosphatase, 
transport (glucose-6-phosphate) protein 1 (G6PT1), mRNA 
GGCACGAGGGGCCACCGAGGCGCTGTCCCTGACCACCAGCACGAGACCCCTTTCTATCGCGCCAGTCCTG 
TGGTCTCCGCACCTCTCCAGCTCCTGCACCCCCGGCCCCCGTGGTTCCCAGCCGCACAGTAGCGTGTCCT 

GGGTAGCGTGAGGACCCACGGGGCTGAGCAGGTGCCACGAGCCCGCCGCCTCTTCGCCGCCCGCCGCCTC 
TCCTCCTCTCCCGCCCGCCGCCTGGCCCTCCCCTACCAGGCTGAGCCTCTGGCTGCCAGAAGCGCGGGGC 
CTCCGGGAGAATACGTGCGGTCGCCCGCTCCGCGTGCGCCTACGCCTTCTGCTCCAGTTGCTTTCCCAAT 
TGAGCGGAAAAGCCGGGGCATGTTGCCGGGGCCCTGGGCGGGACGGTTGTGCCCTGCAGCCCGAAGCCCG 
CCGGGGCACCTTCCCGCCCACGAGCTGCCCAGTCCCTCTGCTTGCGGCCCCTGCCAACGTCCCACAGGAC 

ACTGGGTCCCCTTGGAGCCTCCCCAGGCTTAATGATTGTCCAGAAGGCGGCTATAAAGGGAGCCTGGGAG 
GCTGGGTGGAGGAGGGAGCAGAAAAAACCCAACTCAGCAGATCTGGGAACTGTGAGAGCGGCAAGCAGGA 
ACTGTGGTCAGAGGCTGTGCGTCTTGGCTGGTAGGGCCTGCTCTTTTCTACCATGGCAGCCCAGGGCTAT 
GGCTATTATCGCACTGTGATCTTCTCAGCCATGTTTGGGGGCTACAGCCTGTATTACTTCAATCGCAAGA 
CCTTCTCCTTTGTCATGCCATCATTGGTGGAAGAGATCCCTTTGGACAAGGATGATTTGGGGTTCATCAC 

CAGCAGCCAGTCGGCAGCTTATGCTATCAGCAAGTTTGTCAGTGGGGTGCTGTCTGACCAGATGAGTGCT 
CGCTGGCTCTTCTCTTCTGGGCTGCTCCTGGTTGGCCTGGTCAACATATTCTTTGCCTGGAGCTCCACAG 
TACCTGTCTTTGCTGCCCTCTGGTTCCTTAATGGCCTGGCCCAGGGGCTGGGCTGGCCCCCATGTGGGAA 
GGTCCTGCGGAAGTGGTTTGAGCCATCTCAGTTTGGCACTTGGTGGGCCATCCTGTCAACCAGCATGAAC 
CTGGCTGGAGGGCTGGGCCCTATCCTGGCAACCATCCTTGCCCAGAGCTACAGCTGGCGCAGCACGCTGG 

CCCTATCTGGGGCACTGTGTGTGGTTGTCTCCTTCCTCTGTCTCCTGCTCATCCACAATGAACCTGCTGA 
TGTTGGACTCCGCAACCTGGACCCCATGCCCTCTGAGGGCAAGAAGGGCTCCTTGAAGGAGGAGAGCACC 
CTGCAGGAGCTGCTGCTGTCCCCTTACCTGTGGGTGCTCTCCACTGGTTACCTTGTGGTGTTTGGAGTAA 
AGACCTGCTGTACTGACTGGGGCCAGTTCTTCCTTATCCAGGAGAAAGGACAGTCAGCCCTTGTAGGTAG 
CTCCTACATGAGTGCCCTGGAAGTTGGGGGCCTTGTAGGCAGCATCGCAGCTGGCTACCTGTCAGACCGG 

GCCATGGCAAAGGCGGGACTGTCCAACTACGGGAACCCTCGCCATGGCCTGTTGCTGTTCATGATGGCTG 
GCATGACAGTGTCCATGTACCTCTTCCGGGTAACAGTGACCAGTGACTCCCCCAAGCTCTGGATCCTGGT 
ATTGGGAGCTGTATTTGGTTTCTCCTCGTATGGCCCCATTGCCCTGTTTGGAGTCATAGCCAACGAGAGT 
GCCCCTCCCAACTTGTGTGGCACCTCCCACGCCATTGTGGGACTCATGGCCAATGTGGGCGGCTTTCTGG 
CTGGGCTGCCCTTCAGCACCATTGCCAAGCACTACAGTTGGAGCACAGCCTTCTGGGTGGCTGAAGTGAT 

TTGTGCGGCCAGCACGGCTGCCTTCTTCCTCCTACGAAACATCCGCACCAAGATGGGCCGAGTGTCCAAG 
AAGGCTGAGTGAAGAGAGTCCAGGTTCCGGAGCACCATCCCACGGTGGCCTTCCCCCTGCACGCTCTGCG 
GGGAGTIAAAGGAGGGGCCTGCCTGGCTAGCCCTGAACCTTTCACTTTCCATTTCTGCGCCTTTTCTGTCA 
CCCGGGTGGCGCTGGAAGTTATCAGTGGCTAGTGAGGTCCCAGCTCCCTGATCCTATGCTCTATTTAAAA 
GATAACCTTTGGCCTTAGACTCCGTTAGCTCCTATTTCCTGCCTTCAGACAAACAGGAAACTTCTGCAGT 
CAGGAAGGCTCCTGTACCCTTCTTCTTTTCCTAGGCCCTGTCCTGCCCGCATCCTACCCCATCCCCACCT 
GAAGTGAGGCTATCCCTGCAGCTGCAGGGCACTAATGACCCTTGACTTCTGCTGGGTCCTAAGTCCTCTC 
AGCAGTGGGTGACTGCTGTTGCCAATACCTCAGACTCCAGGGAAAGAGAGGAGGCCATCATTCTCACTGT 

ACCACTAGGCGCAGTTGGATATAGGTGGGAAGAAAAGGTGACTTGTTATAGAAGATTAAAACTAGATTTG 
ATACTGAAAAAAAAAAAAflAAAAAAAAAAAAAAAAAAAAA (SEQ ID NO: 6685) 



gi|4503130|ref|NM_001904.1| Homo sapiens catenin (cadherin-associated 
protein), beta 1, 88kDa (CTNNB1), mRNA 

AAGCCTCTCGGTCTGTGGCAGCAGCGTTGGCCCGGCCCCGGGAGCGGAGAGCGAGGGGAGGCGGAGACGG 
AGGAAGGTCTGAGGAGCAGCTTCAGTCCCCGCCGAGCCGCCACCGCAGGTCGAGGACGGTCGGACTCCCG 
CGGCGGGAGGAGCCTGTTCCCCTGAGGGTATTTGAAGTATACCATACAACTGTTTTGAAAATCCAGCGTG 
GACAATGGCTACTCAAGCTGATTTGATGGAGTTGGACATGGCCATGGAACCAGACAGAAAAGCGGCTGTT 
AGTCACTGGCAGCAACAGTCTTACCTGGACTCTGGAATCCATTCTGGTGCCACTACCACAGCTCCTTCTC 
TGAGTGGTAAAGGCAATCCTGAGGAAGAGGATGTGGATACCTCCCAAGTCCTGTATGAGTGGGAACAGGG 
ATTTTCTCAGTCCTTCACTCAAGAACAAGTAGCTGATATTGATGGACAGTATGCAATGACTCGAGCTCAG 
AGGGTACGAGCTGCTATGTTCCCTGAGACATTAGATGAGGGCATGCAGATCCCATCTACACAGTTTGATG 
CTGCTCATCCCACTAATGTCCAGCGTTTGGCTGAACCATCACAGATGCTGAAACATGCAGTTGTAAACTT 
GATTAACTATCAAGATGATGCAGAACTTGCCACACGTGCAATCCCTGAACTGACAAAACTGCTAAATGAC 
GAGGACCAGGTGGTGGTTAATAAGGCTGCAGTTATGGTCCATCAGCTTTCTAAAAAGGAAGCTTCCAGAC 
ACGCTATCATGCGTTCTCCTCAGATGGTGTCTGCTATTGTACGTACCATGCAGAATACAAATGATGTAGA 
AACAGCTCGTTGTACCGCTGGGACCTTGCATAACCTTTCCCATCATCGTGAGGGCTTACTGGCCATCTTT 
AAGTCTGGAGGCATTCCTGCCCTGGTGAAAATGCTTGGTTCACCAGTGGATTCTGTGTTGTTTTATGCCA 
TTACAACTCTCCACAACCTTTTATTACATCAAGAAGGAGCTAAAATGGCAGTGCGTTTAGCTGGTGGGCT 
GCAGAAAATGGTTGCCTTGCTCAACAAAACAAATGTTAAATTCTTGGCTATTACGACAGACTGCCTTCAA 
ATTTTAGCTTATGGCAACCAAGAAAGCAAGCTCATCATACTGGCTAGTGGTGGACCCCAAGCTTTAGTAA 
ATATAATGAGGACCTATACTTACGAAAAACTACTGTGGACCACAAGCAGAGTGCTGAAGGTGCTATCTGT 
CTGCTCTAGTAATAAGCCGGCTATTGTAGAAGCTGGTGGAATGCAAGCTTTAGGACTTCACCTGACAGAT 
CCAAGTCAACGTCTTGTTCAGAACTGTCTTTGGACTCTCAGGAATCTTTCAGATGCTGCAACTAAACAGG 
AAGGGATGGAAGGTCTCCTTGGGACTCTTGTTCAGCTTCTGGGTTCAGATGATATAAATGTGGTCACCTG 
TGCAGCTGGAATTCTTTCTAACCTCACTTGCAATAATTATAAGAACAAGATGATGGTCTGCCAAGTGGGT 
GGTATAGAGGCTCTTGTGCGTACTGTCCTTCGGGCTGGTGACAGGGAAGACATCACTGAGCCTGCCATCT 
GTGCTCTTCGTCATCTGACCAGCCGACACCAAGAAGCAGAGATGGCCCAGAATGCAGTTCGCCTTCACTA 
TGGACTACCAGTTGTGGTTAAGCTCTTACACCCACCATCCCACTGGCCTCTGATAAAGGCTACTGTTGGA 
TTGATTCGAAATCTTGCCCTTTGTCCCGCAAATCATGCACCTTTGCGTGAGCAGGGTGCCATTCCACGAC 
TAGTTCAGTTGCTTGTTCGTGCACATCAGGATACCCAGCGCCGTACGTCCATGGGTGGGACACAGCAGCA 
ATTTGTGGAGGGGGTCCGCATGGAAGAAATAGTTGAAGGTTGTACCGGAGCCCTTCACATCCTAGCTCGG 
GATGTTCACAACCGAATTGTTATCAGAGGACTAAATACCATTCCATTGTTTGTGCAGCTGCTTTATTCTC 
CCATTGAAAACATCCAAAGAGTAGCTGCAGGGGTCCTCTGTGAACTTGCTCAGGACAAGGAAGCTGCAGA 
AGCTATTGAAGCTGAGGGAGCCACAGCTCCTCTGACAGAGTTACTTCACTCTAGGAATGAAGGTGTGGCG 
ACATATGCAGCTGCTGTTTTGTTCCGAATGTCTGAGGACAAGCCACAAGATTACAAGAAACGGCTTTCAG 
TTGAGCTGACCAGCTCTCTCTTCAGAACAGAGCCAATGGCTTGGAATGAGACTGCTGATCTTGGACTTGA 
TATTGGTGCCCAGGGAGAACCCCTTGGATATCGCCAGGATGATCCTAGCTATCGTTCTTTTCACTCTGGT 
GGATATGGCCAGGATGCCTTGGGTATGGACCCCATGATGGAACATGAGATGGGTGGCCACCACCCTGGTG 
CTGACTATCCAGTTGATGGGCTGCCAGATCTGGGGCATGCCCAGGACCTCATGGATGGGCTGCCTCCAGG 
TGACAGCAATCAGCTGGCCTGGTTTGATACTGACCTGTAAATCATCCTTTAGCTGTATTGTCTGAACTTG 
CATTGTGATTGGCCTGTAGAGTTGCTGAGAGGGCTCGAGGGGTGGGCTGGTATCTCAGAAAGTGCCTGAC 
ACACTAACCAAGCTGAGTTTCCTATGGGAACAATTGAAGTAAACTTTTTGTTCTGGTCCTTTTTGGTCGA 
GGAGTAACAATACAAATGGATTTTGGGAGTGACTCAAGAAGTGAAGAATGCACAAGAATGGATCACAAGA 
TGGAATTTAGCAAACCCTAGCCTTGCTTGTTAAAATTTTTTTTTTTTTTTTTTT7VAGAATATCTGTAATG 
GTACTGACTTTGCTTGCTTTGAAGTAGCTCTTTTTTTTTTTTTTTTTTTTTTTTTTTGCAGTAACTGTTT 
TTTAAGTCTCTCGTAGTGTTAAGTTATAGTGAATACTGCTACAGCAATTTCTAATTTTTAAGAATTGAGT 

AATGGTGTAGAACACTAATTAATTCATAATCACTCTAATTAATTGTAATCTGAATAAAGTGTAACAATTG 

TGTAGCCTTTTTGTATAAAATAGACAAATAGAAAATGGTCCAATTAGTTTCCTTTTTAATATGCTTAAAA 

TAAGCAGGTGGATCTATTTCATGTTTTTGATCAAAAACTATTTGGGATATGTATGGGTAGGGTAAATCAG 

TAAGAGGTGTTATTTGGAACCTTGTTTTGGACAGTTTACCAGTTGCCTTTTATCCCAAAGTTGTTGTAAC 

CTGCTGTGATACGATGCTTCAAGAGAAAATGCGGTTATAAAAAATGGTTCAGAATTAAACTTTTAATTCA 
TT (SEQ ID NO: 6686) 



gi|18104977|ref|NM_002827.2| Homo sapiens protein tyrosine phosphatase, 
non-receptor type 1 (PTPN1), mRNA 

GTGATGCGTAGTTCCGGCTGCCGGTTGACATGAAGAAGCAGCAGCGGCTAGGGCGGCGGTAGCTGCAGGG 
GTCGGGGATTGCAGCGGGCCTCGGGGCTAAGAGCGCGACGCGGCCTAGAGCGGCAGACGGCGCAGTGGGC 
CGAGAAGGAGGCGCAGCAGCCGCCCTGGCCCGTCATGGAGATGGAAAAGGAGTTCGAGCAGATCGACAAG 

TCCGGGAGCTGGGCGGCCATTTACCAGGATATCCGACATGAAGCCAGTGACTTCCCATGTAGAGTGGCCA 
AGCTTCCTAAGAACAAAAACCGAAATAGGTACAGAGACGTCAGTCCCTTTGACCATAGTCGGATTAAACT 
ACATCAAGAAGATAATGACTATATCAACGCTAGTTTGATAAAAATGGAAGAAGCCCAAAGGAGTTACATT 
CTTACCCAGGGCCCTTTGCCTAACACATGCGGTCACTTTTGGGAGATGGTGTGGGAGCAGAAAAGCAGGG 
GTGTCGTCATGCTCAACAGAGTGATGGAGAAAGGTTCGTTAAAATGCGCACAATACTGGCCACAAAAAGA 

AGAAAAAGAGATGATCTTTGAAGACACAAATTTGAAATTAACATTGATCTCTGAAGATATCAAGTCATAT 
TATACAGTGCGACAGCTAGAATTGGAAAACCTTACAACCCAAGAAACTCGAGAGATCTTACATTTCCACT 
ATACCACATGGCCTGACTTTGGAGTCCCTGAATCACCAGCCTCATTCTTGAACTTTCTTTTCAAAGTCCG 
AGAGTCAGGGTCACTCAGCCCGGAGCACGGGCCCGTTGTGGTGCACTGCAGTGCAGGCATCGGCAGGTCT 
GGAACCTTCTGTCTGGCTGATACCTGCCTCTTGCTGATGGACAAGAGGAAAGACCCTTCTTCCGTTGATA 

TCAAGAAAGTGCTGTTAGAAATGAGGAAGTTTCGGATGGGGCTGATCCAGACAGCCGACCAGCTGCGCTT 
CTCCTACCTGGCTGTGATCGAAGGTGCCAAATTCATCATGGGGGACTCTTCCGTGCAGGATCAGTGGAAG 
GAGCTTTCCCACGAGGACCTGGAGCCCCCACCCGAGCATATCCCCCCACCTCCCCGGCCACCCAAACGAA 
TCCTGGAGCCACACAATGGGAAATGCAGGGAGTTCTTCCCAAATCACCAGTGGGTGAAGGAAGAGACCCA 
GGAGGATAAAGACTGCCCCATCAAGGAAGAAAAAGGAAGCCCCTTAAATGCCGCACCCTACGGCATCGAA 

AGCATGAGTCAAGACACTGAAGTTAGAAGTCGGGTCGTGGGGGGAAGTCTTCGAGGTGCCCAGGCTGCCT
CCCCAGCCAAAGGGGAGCCGTCACTGCCCGAGAAGGACGAGGACCATGCACTGAGTTACTGGAAGCCCTT 
CCTGGTCAACATGTGCGTGGCTACGGTCCTCACGGCCGGCGCTTACCTCTGCTACAGGTTCCTGTTCAAC 
AGCAACACATAGCCTGACCCTCCTCCACTCCACCTCCACCCACTGTCCGCCTCTGCCCGCAGAGCCCACG
CCCGACTAGCAGGCATGCCGCGGTAGGTAAGGGCCGCCGGACCGCGTAGAGAGCCGGGCCCCGGACGGAC 

GTTGGTTCTGCACTAAAACCCATCTTCCCCGGATGTGTGTCTCACCCCTCATCCTTTTACTTTTTGCCCC
TTCCACTTTGAGTACCAAATCCACAAGCCATTTTTTGAGGAGAGTGAAAGAGAGTACCATGCTGGCGGCG 
CAGAGGGAAGGGGCCTACACCCGTCTTGGGGCTCGCCCCACCCAGGGCTCCCTCCTGGAGCATCCCAGGC 
GGGCGGCACGCCAACAGCCCCCCCCTTGAATCTGCAGGGAGCAACTCTCCACTCCATATTTATTTAAACA 
ATTTTTTCCCCAAAGGCATCCATAGTGCACTAGCATTTTCTTGAACCAATAATGTATTAAAATTTTTTGA 

TGTCAGCCTTGCATCAAGGGCTTTATCAAAAAGTACT^TAATAAATCCTCAGGTAGTACTGGGAATGGAA
GGCTTTGCCATGGGCCTGCTGCGTCAGACCAGTACTGGGAAGGAGGACGGTTGTAAGCAGTTGTTATTTA 
GTGATATTGTGGGTAACGTGAGAAGATAGAACAATGCTATAATATAT7VATGAACACGTGGGTATTTAATA 
AGAAACATGATGTGAGATTACTTTGTCCCGCTTATTCTCCTCCCTGTTATCTGCTAGATCTAGTTCTCAA 
TCACTGCTCCCCCGTGTGTATTAGAATGCATGTAAGGTCTTCTTGTGTCCTGATGAAAAATATGTGCTTG 

AAATGAGAAACTTTGATCTCTGCTTACTAATGTGCCCCATGTCCAAGTCCAACCTGCCTGTGCATGACCT
GATCATTACATGGCTGTGGTTCCTAAGCCTGTTGCTGAAGTCATTGTCGCTCAGCAATAGGGTGCAGTTT 
TCCAGGAATAGGCATTTGCCTAATTCCTGGCATGACACTCTAGTGACTTCCTGGTGAGGCCCAGCCTGTC 
CTGGTACAGCAGGGTCTTGCTGTAACTCAGACATTCCTW^GGGTATGGGAAGCCATATTCACACCTCACGC 
TCTGGACATGATTTAGGGAAGCAGGGACACCCCCCGCCCCCCACCTTTGGGATCAGCCTCCGCCATTCCA 

AGTCAACACTCTTCTTGAGCAGACCGTGATTTGGAAGAGAGGCACCTGCTGGAAACCACACTTCTTGAAA
CAGCCTGGGTGACGGTCCTTTAGGCAGCCTGCCGCCGTCTCTGTCCCGGTTCACCTTGCCGAGAGAGGCG 
CGTCTGCCCCACCCTCAAACCCTGTGGGGCCTGATGGTGCTCACGACTCTTCCTGCAAAGGGAACTGAAG 
ACCTCCACATTAAGTGGCTTTTTAACATGAAAAACACGGCAGCTGTAGCTCCCGAGCTACTCTCTTGCCA 
GCATTTTCACATTTTGCCTTTCTCGTGGTAGAAGCCAGTACAGAGAAATTGTGTGGTGGGAACATTCGAG 

GTGTCACCCTGCAGAGCTATGGTGAGGTGTGGATAAGGCTTAGGTGCCAGGCTGTAAGCATTCTGAGCTG
GGCTTGTTGTTTTTAAGTCCTGTATATGTATGTAGTAGTTTGGGTGTGTATATATAGTAGCATTTCAAAA

TGGACGTACTGGTTTAACCTCCTATCCTTGGAGAGCAGCTGGCTCTCCACCTTGTTACACATTATGTTAG 

AGAGGTAGCGAGCTGCTCTGCTATATGCCTTAAGCCAATATTTACTCATCAGGTCATTATTTTTTACAAT 
GGCCATGGAATAAACCATTTTTACAAAA (SEQ ID NO: 6687) 




gi|12831192|gb|AF333324.1| Hepatitis C virus type 1b polyprotein mRNA,
complete cds

GCCAGCCCCCGATTGGGGGCGACACTCCACCATAGATCACTCCCCTGTGAGGAACTACTGTCTTCACGCA 

GAAAGCGTCTAGCCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCCA
TAGTGGTCTGCGGAACCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGATCAACCCG 
CTCAATGCCTGGAGATTTGGGCGTGCCCCCGCGAGACTGCTAGCCGAGTAGTGTTGGGTCGCGAAAGGCC
TTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAGGTCTCGTAGACCGTGCATCATGAGCACA 
AATCCTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGCCGCCCACAGGACGTTAAGTTCCCGGGCG

GTGGTCAGATCGTTGGTGGAGTTTACCTGTTGCCGCGCAGGGGCCCCAGGTTGGGTGTGCGCGCGACTAG
GAAGACTTCCGAGCGGTCGCAACCTCGTGGAAGGCGACAACCTATCCCCAAGGCTCGCCGGCCCGAGGGT 
AGGACCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAACGAGGGTATGGGGTGGGCAGGATGGC 
TCCTGTCACCCCGTGGCTCTCGGCCTAGTTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGTAATTTGGG 
TAAGGTCATCGATACCCTTACATGCGGCTTCGCCGACCTCATGGGGTACATTCCGCTTGTCGGCGCCCCC 

CTAGGAGGCGCTGCCAGGGCCCTGGCGCATGGCGTCCGGGTTCTGGAGGACGGCGTGAACTATGCAACAG
GGAATCTGCCCGGTTGCTCTTTCTCTATCTTCCTCTTAGCTTTGCTGTCTTGTTTGACCATCCCAGCTTC 
CGCTTACGAGGTGCGCAACGTGTCCGGGATATACCATGTCACGAACGACTGCTCCAACTCAAGTATTGTG 
TATGAGGCAGCGGACATGATCATGCACACCCCCGGGTGCGTGCCCTGCGTCCGGGAGAGTAATTTCTCCC 
GTTGCTGGGTAGCGCTCACTCCCACGCTCGCGGCCAGGAACAGCAGCATCCCCACCACGACAATACGACG 

CCACGTCGATTTGCTCGTTGGGGCGGCTGCTCTCTGTTCCGCTATGTACGTTGGGGATCTCTGCGGATCC
GTTTTTCTCGTCTCCCAGCTGTTCACCTTCTCACCTCGCCGGTATGAGACGGTACAAGATTGCAATTGCT 
CAATCTATCCCGGCCACGTATCAGGTCACCGCATGGCTTGGGATATGATGATGAACTGGTCACCTACAAC 
GGCCCTAGTGGTATCGCAGCTACTCCGGATCCCACAAGCCGTCGTGGACATGGTGGCGGGGGCCCACTGG 
GGTGTCCTAGCGGGCCTTGCCTACTATTCCATGGTGGGGAACTGGGCTAAGGTCTTGATTGTGATGCTAC

TCTTTGCTGGCGTTGACGGGCACACCCACGTGACAGGGGGAAGGGTAGCCTCCAGCACCCAGAGCCTCGT
GTCCTGGCTCTCACAAGGGCCATCTCAGAAAATCCAACTCGTGAACACCAACGGCAGCTGGCACATCAAC 
AGGACCGCTCTGAATTGCAATGACTCCCTCCAAACTGGGTTCATTGCTGCGCTGTTCTACGCACACAGGT 
TCAACGCGTCCGGATGTCCAGAGCGCATGGCCAGCTGCCGCCCCATCGACAAGTTCGCTCAGGGGTGGGG 
TCCCATCACTCACGTTGTGCCTAACATCTCGGACCAGAGGCCTTATTGCTGGCACTATGCACCCCAACCG 

TGCGGTATTGTACCfcGCGTCGCAGGTGTGTGGCCCAGTGTATTGCTTCACCCCGAGTCCTGTTGTGGTGG
GGACGACCGACCGTTCCGGAGTCCCCACGTATAGCTGGGGGGAGAATGAGACAGACGTGCTGCTACTCAA 
CAACACGCGGCCGCCGCAAGGCAACTGGTTCGGCTGTACATGGATGAATAGCACCGGGTTCACCAAGACG 
TGCGGGGGCCCCCCGTGTAACATCGGGGGGGTTGGCAACAACACCTTGATTTGCCCCACGGATTGCTTCC 
GAAAGCACCCCGAGGCCACTTACACCAAATGCGGCTCGGGTCCTTGGTTGACACCTAGGTGTCTAGTTGA

CTACCCATACAGACTTTGGCACTACCCCTGCACTATCAATTTTACCATCTTCAAGGTCAGGATGTACGTG
GGGGGCGTGGAGCACAGGCTCAACGCCGCGTGCAATTGGACCCGAGGAGAGCGCTGTGACCTGGAGGACA 
GGGATAGATCAGAGCTTAGCCCGCTGCTATTGTCTACAACGGAGTGGCAGGTACTGCCCTGTTCCTTTAC 
CACCCTACCGGCTCTGTCCACTGGATTGATCCACCTCCATCAGAATATCGTGGACGTGCAATACCTGTAC 
GGTGTAGGGTCAGTGGTTGTCTCCGTCGTAATCAAATGGGAGTATGTTCTGCTGCTCTTCCTTCTCCTGG 

CGGACGCGCGCGTCTGTGCCTGCTTGTGGATGATGCTGCTGATAGCCCAGGCTGAGGCCACCTTAGAGAA
CCTGGTGGTCCTCAATGCGGCGTCTGTGGCCGGAGCGCATGGCCTTCTCTCCTTCCTCGTGTTCTTCTGC 
GCCGCCTGGTACATCAAAGGCAGGCTGGTCCCTGGGGCGGCATATGCTCTCTATGGCGTATGGCCGTTGC 
TCCTGCTCTTGCTGGCTTTACCACCACGAGCTTATGCCATGGACCGAGAGATGGCTGCATCGTGCGGAGG 
CGCGGTTTTTGTAGGTCTGGTACTCTTGACCTTGTCACCATACTATAAGGTGTTCCTCGCTAGGCTCATA 

TGGTGGTTACAATATTTTATCACCAGGGCCGAGGCGCACTTGCAAGTGTGGGTCCCCCCTCTTAATGTTC
GGGGAGGCCGCGATGCCATCATCCTCCTTACATGCGCGGTCCATCCAGAGCTAATCTTTGACATCACCAA 
ACTCCTGCTCGCCATACTCGGTCCGCTCATGGTGCTCCAAGCTGGCATAACCAGAGTGCCGTACTTCGTG 
CGCGCTCAAGGGCTCATTCATGCATGCATGTTAGTGCGGAAGGTCGCTGGGGGTCATTATGTCCAAATGG 
CCTTCATGAAGCTGGGCGCGCTGACAGGCACGTACATTTACAACCATCTTACCCCGCTACGGGATTGGGC 

CCACGCGGGCCTACGAGACCTTGCGGTGGCAGTGGAGCCCGTCGTCTTCTCCGACATGGAGACCAAGATC
ATCACCTGGGGAGCAGACACCGCGGCGTGTGGGGACATCATCTTGGGTCTGCCCGTCTCCGCCCGAAGGG
GAAAGGAGATACTCCTGGGCCCGGCCGATAGTCTTGAAGGGCGGGGGTGGCGACTCCTCGCGCCCATCAC
GGCCTACTCCCAACAGACGCGGGGCCTACTTGGTTGCATCATCACTAGCCTTACAGGCCGGGACAAGAAC 
CAGGTCGAGGGAGAGGTTCAGGTGGTTTCCACCGCAACACAATCCTTCCTGGCGACCTGCGTCAACGGCG 
TGTGTTGGACCGTTTACCATGGTGCTGGCTCAAAGACCTTAGCCGGCCCAAAGGGGCCAATCACCCAGAT
GTACACTAATGTGGACCAGGACCTCGTCGGCTGGCAGGCGCCCCCCGGGGCGCGTTCCTTGACACCATGC 
ACCTGTGGCAGCTCAGACCTTTACTTGGTCACGAGACATGCTGACGTCATTCCGGTGCGCCGGCGGGGCG 
ACAGTAGGGGGAGCCTGCTCTCCCCCAGGCCTGTCTCCTACTTGAAGGGCTCTTCGGGTGGTCCACTGCT 
CTGCCCTTCGGGGCACGCTGTGGGCATCTTCCGGGCTGCCGTATGCACCCGGGGGGTTGCGAAGGCGGTG 

GACTTTGTGCCCGTAGAGTCCATGGAAACTACTATGCGGTCTCCGGTCTTCACGGACAACTCATCCCCCC
CGGCCGTACCGCAGTCATTTCAAGTGGCCCACCTACACGCTCCCACTGGCAGCGGCAAGAGTACTAAAGT 
GCCGGCTGCATATGCAGCCCAAGGGTACAAGGTGCTCGTCCTCAATCCGTCCGTTGCCGCTACCTTAGGG 
TTTGGGGCGTATATGTCTAAGGCACACGGTATTGACCCCAACATCAGAACTGGGGTAAGGACCATTACCA 
CAGGCGCCCCCGTCACATACTCTACCTATGGCAAGTTTCTTGCCGATGGTGGTTGCTCTGGGGGCGCTTA 

TGACATCATAATATGTGATGAGTGCCATTCAACTGACTCGACTACAATCTTGGGCATCGGCACAGTCCTG
GACCAAGCGGAGACGGCTGGAGCGCGGCTTGTCGTGCTCGCCACCGCTACGCCTCCGGGATCGGTCACCG 
TGCCACACCCAAACATCGAGGAGGTGGCCCTGTCTAATACTGGAGAGATCCCCTTCTATGGCAAAGCCAT 
CCCCATTGAAGCCATCAGGGGGGGAAGGCATCTCATTTTCTGTCATTCCAAGAAGAAGTGCGACGAGCTC 
GCCGCAAAGCTGTCAGGCCTCGGT^TCAACGCTGTGGCGTATTACCGGGGGCTCGATGTGTCCGTCATAC 

CAACTATCGGAGACGTCGTTGTCGTGGCAACAGACGCTCTGATGACGGGCTATACGGGCGACTTTGACTC
AGTGATCGACTGTAACACATGTGTCACCCAGACAGTCGACTTCAGCTTGGATCCCACCTTCACCATTGAG 
ACGACGACCGTGCCTCAAGACGCAGTGTCGCGCTCGCAGCGGCGGGGTAGGACTGGCAGGGGTAGGAGAG 
GCATCTACAGGTTTGTGACTCCGGGAGAACGGCCCTCGGGCATGTTCGATTCCTCGGTCCTGTGTGAGTG 
CTATGACGCGGGCTGTGCTTGGTACGAGCTCACCCCCGCCGAGACCTCGGTTAGGTTGCGGGCCTACCTG 

AACACACCAGGGTTGCCCGTTTGCCAGGACCACCTGGAGTTCTGGGAGAGTGTCTTCACAGGCCTCACCC
ACATAGATGCACACTTCTTGTCCCAGACCAAGCAGGCAGGAGACAACTTCCCCTACCTGGTAGCATACCA 
AGCCACGGTGTGCGCCAGGGCTCAGGCCCCACCTCCATCATGGGATCAAATGTGGAAGTGTCTCATACGG 
CTGAAACCTACGCTGCACGGGCCAACACCCTTGCTGTACAGGCTGGGAGCCGTCCAAAATGAGGTCACCC 
TCACCCACCCCATAACCAAATACATCATGGCATGCATGTCGGCTGACCTGGAGGTCGTCACTAGCACCTG 

GGTGCTGGTGGGCGGAGTCCTTGCAGCTCTGGCCGCGTATTGCCTGACAACAGGCAGTGTGGTCATTGTG
GGTAGGATTATCTTGTCCGGGAGGCCGGCTATTGTTCCCGACAGGGAGCTTCTCTACCAGGAGTTCGATG 
AAATGGAAGAGTGCGCCACGCACCTCCCTTACATTGAGCAGGGAATGCAGCTCGCCGAGCAGTTCAAGCA
GAAAGCGCTCGGGTTACTGCAAACAGCCACCAAACAAGCGGAGGCTGCTGCTCCCGTGGTGGAGTCCAAG 
TGGCGAGCCCTTGAGACATTCTGGGCGAAGCACATGTGGAATTTCATCAGCGGGATACAGTACTTAGCAG 

GCTTATCCACTCTGCCTGGGAACCCCGCAATAGCATCATTGATGGCATTCACAGCCTCTATCACCAGCCC
GCTCACCACCCAAAGTACCCTCCTGTTTAACATCTTGGGGGGGTGGGTGGCTGCCCAACTCGCCCCCCCC 
AGCGCCGCTTCGGCTTTCGTGGGCGCCGGCATCGCCGGTGCGGCTGTTGGCAGCATAGGCCTTGGGAAGG 
TGCTTGTGGACATTCTGGCGGGTTATGGAGCAGGAGTGGCCGGCGCGCTCGTGGCCTTTAAGGTCATGAG 
CGGCGAGATGCCCTCTACCGAGGACCTGGTCAATCTACTTCCTGCCATCCTCTCTCCTGGCGCCCTGGTC 

GTCGGGGTCGTGTGTGCAGCAATACTGCGTCGGCACGTGGGTCCGGGAGAGGGGGCTGTGCAGTGGATGA
ACCGGCTGATAGCGTTCGCCTCGCGGGGTAATCACGTTTCCCCCACGCACTATGTGCCTGAGAGCGACGC 
CGCAGCGCGTGTTACTCAGATCCTCTCCAGCCTTACCATCACTCAGCTGCTGAAAAGGCTCCACCAGTGG 
ATTAATGAGGACTGCTCCACACCGTGTTCCGGCTCGTGGCTAAGGGATGTTTGGGACTGGATATGCACGG 
TGTTGACTGACTTCAAGACCTGGCTCCAGTCCAAGCTCCTGCCGCAGCTACCGGGAGTCCCTTTTTTCTC 

GTGCCAACGCGGGTACAAGGGAGTCTGGCGGGGAGACGGCATCATGCAAACCACCTGCCCATGTGGAGCA
CAGATCACCGGACATGTCAAAAACGGTTCCATGAGGATCGTCGGGCCTAAGACCTGCAGCAACACGTGGC 
ATGGAACATTCCCCATCAACGCATACACCACGGGCCCCTGCACACCCTCTCCAGCGCCAAACTATTCTAG 
GGCGCTGTGGCGGGTGGCCGCTGAGGAGTACGTGGAGGTCACGCGGGTGGGGGATTTCCACTACGTGACG 
GGCATGACCACTGACAACGTAAAGTGCCCATGCCAGGTTCCGGCTCCTGAATTCTTCTCGGAGGTGGACG 

GAGTGCGGTTGCACAGGTACGCTCCGGCGTGCAGGCCTCTCCTACGGGAGGAGGTTACATTCCAGGTCGG
GCTCAACCAATACCTGGTTGGGTCACAGCTACCATGCGAGCCCGAACCGGATGTAGCAGTGCTCACTTCC
ATGCTCACCGACCCCTCCCACATCACAGCAGAAACGGCTAAGCGTAGGTTGGCCAGGGGGTCTCCCCCCT 
CCTTGGCCAGCTCTTCAGCTAGCCAGTTGTCTGCGCCTTCCTTGAAGGCGACATGCACTACCCACCATGT 
CTCTCCGGACGCTGACCTCATCGAGGCCAACCTCCTGTGGCGGCAGGAGATGGGCGGGAACATCACCCGC 

GTGGAGTCGGAGAACAAGGTGGTAGTCCTGGACTCTTTCGACCCGCTTCGAGCGGAGGAGGATGAGAGGG
AAGTATCCGTTCCGGCGGAGATCCTGCGGAAATCCAAGAAGTTCCCCGCAGCGATGCCCATCTGGGCGCG 
CCCGGATTACAACCCTCCACTGTTAGAGTCCTGGAAGGACCCGGACTACGTCCCTCCGGTGGTGCACGGG 
TGCCCGTTGCCACCTATCAAGGCCCCTCCAATACCACCTCCACGGAGAAAGAGGACGGTTGTCCTAACAG
AGTCCTCCGTGTCTTCTGCCTTAGCGGAGCTCGCTACTAAGACCTTCGGCAGCTCCGAATCATCGGCCGT 
CGACAGCGGCACGGCGACCGCCCTTCCTGACCAGGCCTCCGACGACGGTGACAAAGGATCCGACGTTGAG 
TCGTACTCCTCCATGCCCCCCCTTGAGGGGGAACCGGGGGACCCCGATCTCAGTGACGGGTCTTGGTCTA 
CCGTGAGCGAGGAAGCTAGTGAGGATGTCGTCTGCTGCTCAATGTCCTACACATGGACAGGCGCCTTGAT
CACGCCATGCGCTGCGGAGGAAAGCAAGCTGCCCATCAACGCGTTGAGCAACTCTTTGCTGCGCCACCAT 
AACATGGTTTATGCCACAACATCTCGCAGCGCAGGCCTGCGGCAGAAGAAGGTCACCTTTGACAGACTGC 
AAGTCCTGGACGACCACTACCGGGACGTGCTCAAGGAGATGAAGGCGAAGGCGTCCACAGTTAAGGCTAA 
ACTCCTATCCGTAGAGGAAGCCTGCAAGCTGACGCCCCCACATTCGGCCAAATCCAAGTTTGGCTATGGG 

GCAAAGGACGTCCGGAACCTATCCAGCAAGGCCGTTAACCACATCCACTCCGTGTGGAAGGACTTGCTGG
AAGACACTGTGACACCAATTGACACCACCATCATGGCAAAAAATGAGGTTTTCTGTGTCCAACCAGAGAA 
AGGAGGCCGTAAGCCAGCCCGCCTTATCGTATTCCCAGATCTGGGAGTCCGTGTATGCGAGAAGATGGCC 
CTCTATGATGTGGTCTCCACCCTTCCTCAGGTCGTGATGGGCTCCTCATACGGATTCCAGTACTCTCCTG 
GGCAGCGAGTCGAGTTCCTGGTGAATACCTGGAAATCAAAGAAAAACCCCATGGGCTTTTCATATGACAC 

TCGCTGTTTCGACTCAACGGTCACCGAGAACGACATCCGTGTTGAGGAGTCAATTTACCAATGTTGTGAC
TTGGCCCCCGAAGCCAGACAGGCCATAAAATCGCTCACAGAGCGGCTTTATATCGGGGGTCCTCTGACTA 
ATTCAAAAGGGCAGAACTGCGGTTATCGCCGGTGCCGCGCGAGCGGCGTGCTGACGACTAGCTGCGGTAA 
CACCCTCACATGTTACTTGAAGGCCTCTGCAGCCTGTCGAGCTGCGAAGCTCCAGGACTGCACGATGCTC 
GTGAACGGAGACGACCTTGTCGTTATCTGTGAAAGCGCGGGAACCCAAGAGGACGCGGCGAGCCTACGAG 

TCTTCACGGAGGCTATGACTAGGTACTCTGCCCCCCCCGGGGACCCGCCCCAACCAGAATACGACTTGGA
GCTGATAACATCATGTTCCTCCAATGTGTCGGTCGCCCACGATGCATCAGGCAAAAGGGTGTACTACCTC 
ACCCGTGATCCCACCACCCCCCTCGCACGGGCTGCGTGGGAAACAGCTAGACACACTCCAGTTAACTCCT
GGCTAGGCAACATTATCATGTATGCGCCCACTTTGTGGGCAAGGATGATTCTGATGACTCACTTCTTCTC 
CATCCTTCTAGCACAGGAGCAACTTGAAAAAGCCCTGGACTGCCAGATCTACGGGGCCTGTTACTCCATT

GAGCCACTTGACCTACCTCAGATCATTGAACGACTCCATGGCCTTAGCGCATTTTCACTCCATAGTTACT
CTCCAGGTGAGATCAATAGGGTGGCTTCATGCCTCAGGAAACTTGGGGTACCACCCTTGCGAGTCTGGAG 
ACATCGGGCCAGGAGCGTCCGCGCTAGGCTACTGTCCCAGGGGGGGAGGGCCGCCACTTGTGGCAAGTAC 
CTCTTCAACTGGGCAGTGAAGACCAAACTCAAACTCACTCCAATCCCGGCTGCGTCCCAGCTGGACTTGT 
CCGGCTGGTTCGTTGCTGGTTACAGCGGGGGAGACATATATCACAGCCTGTCTCGTGCCCGACCCCGCTG 

GTTCATGCTGTGCCTACTCCTACTTTCTGTAGGGGTAGGCATCTACCTGCTCCCCAACCGATGAACGGGG
AGCTAAACACTCCAGGCCAATAGGCCATTTCCTGTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTCT 
TTTCCTTCTTTTTCCCTTTTTCTTTCTTCCTTCTTTAATGGTGGCTCCATCTTAGCCCTAGTCACGGCTA 
GCTGTGAAAGGTCCGTGAGCCGCATGACTGCAGAGAGTGCTGATACTGGCCTCTCTGCAGATCATGT 
(SEQ ID NO: 6688) 


gi|306286|gb|M96362.1|HPCUNKCDS Hepatitis C virus mRNA, complete cds
TGCCAGCCCCCGATTGGGGGCGACACTCCACCATAGATCACTCCCCTGTGAGGAACTACTGTCTTCACGC 
AGAAAGCGTCTAGCCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCC 
ATAGTGGTCTGCGGAACCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGATCAACCC 

GCTCAATGCCTGGAGATTTGGGCGTGCCCCCGCGAGACTGCTAGCCGAGTAGTGTTGGGTCGCGAAAGGC
CTTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAGGTCTCGTAGACCGTGCACCATGAGCAC 
GAATCCTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGCCGCCCACAGGATATTAAGTTCCCGGGC 
GGTGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGGGGCCCCAGGTTGGGTGTGCGCGCGACTA 
GGAAGACTTCCGAGCGGTCGCAACCTCGTGGAAGGCGACAGCCTATCCCCAAGGCTCGCCGGCCCGAGGG 

CAGGGCCTGGGCTCAGCCCGGGTACCCTTGGCCCCTCTATGGCAATGAGGGCTTGGGGTGGGCAGGATGG
CTCCTGTCACCCCGCGGCTCCCGGCCTAGTTGGGGCCCCACGGACCCCCGGCGTAAGTCGCGTAATTTGG 
GTAAGGTCATCGACACCCTCACATGCGGCTTCGCCGACCTCATGGGGTACATTCCGCTCGTCGGCGCCCC 
CCTAGGGGGCGTTGCCAGGGCCCTGGCACATGGTGTCCGGGTGCTGGAGGACGGCGTGAACTATGCAACA 
GGGAATCTGCCCGGTTGCTCTTTCTCTATCTTCCTCTTGGCTCTGCTGTCTTGTTTGACCACCCCAGTTT 

CCGCTTATGAAGTGCGTAACGCGTCCGGGATGTACCATGTCACGAACGACTGCTCCAACTCAAGCATTGT
GTATGAGGCAGCGGACATGATCATGCACACTCCCGGGTGCGTGCCCTGCGTTCGGGAGGACAACTCCTCC 
CGTTGCTGGGTGGCACTTACTCCCACGCTCGCGGCCAGGAATGCCAGCGTCCCCACTACGACATTGCGAC 
GCCATGTCGACTTGCTCGTTGGGGTAGCTGCTTTCTGTTCCGCTATGTACGTGGGGGACCTCTGCGGATC 
TGTTTTCCTTGTTTCCCAGCTGTTCACCTTTTCGCCTCGCCGGCATGAGACGGTACAGGACTGCAACTGC 

TCAATCTATCCCGGCCGCGTATCAGGTCACCGCATGGCCTGGGATATGATGATGAACTGGTCGCCTACAA
CAGCCCTAGTGGTATCGCAGCTACTCCGGATCCCACAAGCTGTCGTGGACATGGTGACAGGGTCCCACTG 






GGGAATCCTGGCGGGCCTTGCCTACTATTCCATGGTGGGGAACTGGGCTAAGGTCTTAATTGCGATGCTA
CTCTTTGCCGGCGTTGACGGAACCACCCACGTGACAGGGGGGGCGCAAGGTCGGGCCGCTAGCTCGCTAA
CGTCCCTCTTTAGCCCTGGGCCGGTTCAGCACCTCCAGCTCATAAACACCAACGGCAGCTGGCATATCAA 
CAGGACCGCCCTGAGCTGCAATGACTCCCTCAACACTGGGTTTGTTGCCGCGCTGTTCTACAAATACAGG 
TTCAACGCGTCCGGGTGCCCGGAGCGCTTGGCCACGTGCCGCCCCATTGATACATTCGCGCAGGGGTGGG
GTCCCATCACTTACACTGAGCCTCATGATTTGGATCAGAGGCCCTATTGCTGGCACTACGCGCCTCAACC 
GTGTGGTATTGTGCCCACGTTGCAGGTGTGTGGCCCAGTATACTGCTTCACCCCGAGTCCTGTTGCGGTG 
GGGACTACCGATCGTTTCGGTGCCCCTACATACAGATGGGGGGCAAATGAGACGGACGTGCTGCTCCTTA 
ACAACGCCGGGCCGCCGCAAGGCAACTGGTTCGGCTGTACATGGATGAATGGCACTGGGTTCACCAAGAC 

ATGTGGGGGCCCCCCGTGTAACATCGGGGGGGTCGGCAACAATACCTTGACCTGCCCCACGGACTGCTTC
CGAAAGCACCCCGGGGCCACTTACACCAAATGCGGTTCGGGGCCTTGGTTAACACCCAGGTGCTTAGTCG 
ACTACCCGTACAGGCTCTGGCATTACCCCTGCACTGTCAACTTTACCATCTTTAAGGTTAGGATGTACGT 
GGGGGGCGCGGAGCACAGGCTCGACGCCGCATGCAACTGGACTCGGGGAGAGCGTTGTGACCTGGAGGAC 
AGGGATAGGTCAGAGCTTAGCCCGCTGCTGCTGTCTACAACAGAGTGGCAGGTACTGCCCTGTTCCTTCA 

CAACCCTACCGGCTCTGTCCACTGGTTTGATTCATCTCCATCAGAACATCGTGGACATACAATACCTGTA
CGGTATAGGGTCGGCGGTTGTCTCCTTTGCGATCAAATGGGAGTATATTGTGCTGCTCTTCCTTCTTCTG 
GCGGACGCGCGCGTCTGCGCTTGCTTGTGGATGATGCTGCTGGTAGCGCAAGCCGAGGCCGCCTTAGAGA 
ACCTGGTGGTCCTCAATGCAGCGTCCGTGGCCGGAGCGCATGGCATTCTTTCCTTCATTGTGTTCTTCTG
TGCTGCCTGGTACATCAAGGGCAGGCTGGTTCCCGGAGCGGCATACGCCCTCTATGGCGTATGGCCGCTG 

CTTCTGCTTCTGCTGGCGTTACCACCACGGGCGTACGCCATGGACCGGGAGATGGCCGCATCGTGCGGAG
GCGCGGTTTTTGTAGGTCTGGTACTCTTGACCTTGTCACCACACTATAAAGTGTTCCTTGCCAGGTTCAT 
ATGGTGGCTACAATATCTCATCACCAGAACCGAAGCGCATCTGCAAGTGTGGGTCCCCCCTCTCAACGTT 
CGGGGGGGTCGCGATGCCATCATCCTCCTCACATGCGTGGTCCACCCAGAGCTAATCTTTGACATCACAA 
AATATTTGCTCGCCATATTCGGCCCGCTCATGGTGCTCCAGGCCGGCATAACTAGAGTGCCGTACTTCGT 

GCGCGCACAAGGGCTCATTCGTGCATGCATGTTGGCGCGGAAAGTCGTGGGGGGTCATTACGTCCAAATG
GTCTTCATGAAGCTGGCCGCACTAGCAGGTACGTACGTTTATGACCATCTTACTCCACTGCGAGATTGGG 
CTCACACGGGCTTACGAGACCTTGCAGTGGCAGTAGAGCCCGTTGTCTTCTCTGACATGGAGACCAAAGT
CATCACCTGGGGGGCAGACACCGCGGCGTGCGGGGACATCATCTTGGCCCTGCCTGCTTCCGCCCGAAGG 
GGGAAGGAGATACTTCTGGGACCGGCCGATAGTCTTGAAGGACAGGGGTGGCGACTCCTTGCGCCCATCA 

CGGCCTACTCCCAACAAACGCGAGGCCTGCTTGGTTGCATCATCACTAGCCTTACAGGCCGGGACAAGAA
CCAGGTTGAGGGGGAGGTTCAAGTGGTTTCCACCGCAACACAATCTTTCCTGGCGACCTGCATCAATGGC
GTGTGTTGGACTGTCTTCCACGGCGCCGGCTCAAAGACCCTAGCCGGCCCAAAGGGTCCAATCACCCAAA 
TGTACACCAATGTAGACCAGGACCTTGTTGGCTGGCCGGCACCTCCTGGGGCGCGTTCCCTGACACCATG 
CACTTGCGGCTCCTCGGACCTTTACCTGGTCACGAGACATGCTGATGTCATTCCGGTGCGCCGGCGGGGT 

GACGGTAGGGGGAGCCTACTCCCCCCCAGGCCTGTCTCCTACTTGAAGGGCTCCTCGGGTGGTCCACTGC
TCTGCCCTTCGGGGCACGCTGTCGGCATACTTCCGGCTGCTGTATGCACCCGGGGGGTTGCCATGGCGGT 
GGAATTCATACCCGTTGAGTCTATGGAAACTACTATGCGGTCTCCGGTCTTCACGGACAATCCGTCTCCC 
CCGGCTGTACCGCAGACATTCCAAGTGGCCCACTTACACGCTCCCACCGGCAGCGGCAAGAGCACTAGGG 
TGCCGGCTGCATATGCAGCCCAAGGGTACAAGGTGCTCGTCCTAAATCCGTCCGTCGCCGCCACCTTGGG 

TTTTGGGGCGTATATGTCCAAGGCACATGGTATCGACCCCAACCTTAGAACTGGGGTAAGGACCATCACC
ACAGGTGCCCCTATCACATACTCCACCTATGGCAAGTTCCTTGCCGACGGTGGCGGCTCCGGGGGCGCCT 
ATGACATCATAATGTGTGATGAGTGCCACTCAACTGACTCGACTACCATTTATGGCATCGGCACAGTCCT 
GGACCAAGCGGAGACGGCTGGAGCGCGGCTCGTGGTGCTCTCCACCGCTACGCCTCCGGGATCGGTCACC 
GTGCCACACCTCAATATCGAGGAGGTGGCCCTGTCTAATACTGGAGAGATCCCCTTCTACGGCAAAGCCA 

TTCCCATCGAGGCTATCAAGGGGGGAAGGCATCTCATTTTCTGCCATTCCAAGAAGAAGTGTGACGAACT
CGCCGCAAAGCTGTCAGGCCTCGGACTCAATGCCGTAGCGTATTACCGGGGTCTTGACGTGTCCGTCATA 
CCGACCAGCGGAGACGTTGTTGTCGTGGCGACGGACGCTCTAATGACGGGCTTTACCGGCGACTTTGACT 
CAGTGATCGACTGTAATACGTGTGTCACCCAGACAGTCGATTTCAGCTTGGACCCCACCTTCACCATTGA 
GACGACGACCGTGCCCCAAGACGCAGTGTCGCGCTCGCAGAGGCGAGGCAGGACTGGTAGGGGCAGGGCT 

GGCATATACAGGTTTGTGACTCCAGGAGAACGGCCCTCGGGCATGTTCGATTCTTCGGTCCTGTGTGAGT
GTTATGACGCGGGTTGTGCGTGGTACGAACTCACGCCCGCTGAGACCTCGGTTAGGTTGCGGGCGTACCT
AAACACACCAGGGTTGCCCGTCTGCCAGGACCATCTGGAGTTCTCGGAGGGTGTCTTCACAGGCCTCACC 
CACATAGATGCCCACTTCTTATCCCAGACTAAACAGGCAGGAGAGAACTTCCCCTACTTGGTAGCATACC 
AGGCTACAGTGTGCGCCAGGGCTCAAGCCCCACCTCCATCGTGGGATGAAATGTGGAGGTGTCTCATACG 

GCTGAAACCTACGCTGCACGGGCCAACACCCCTGCTGTATAGGTTAGGAGCCGTCCAAAATGAGGTCACC
CTCACACACCCCATAACCAAATTCATCATGACATGTATGTCGGCTGACCTGGAGGTCGTCACCAGCACCT 
GGGTGCTGGTAGGCGGAGTCCTCGCAGCTCTGGCCGCGTACTGCCTGACAACAGGCAGCGTGGTCATTGT 
GGGCAGGATCATCCTGTCCGGGAAGCCGGCTATCATCCCCGATAGGGAAGTTCTCTACCAGGAGTTCGAC
GAGATGGAGGAGTGTGCCTCACACCTCCCTTACTTCGAACAGGGAATGCAGCTCGCCGAGCAATTCAAAC 
AGAAGGCGCTCGGGTTGCTGCAAACAGCCACCAAGCAGGCGGAGGCTGCTGCTCCCGTGGTGGAGTCCAA 
GTGGCGAGCCCTTGAGACCTTCTGGGCGAAGCACATGTGGAACTTCATTAGTGGGATACAGTACTTGGCA 
GGCTTGTCCACTCTGCCTGGGAACCCCGCAATACGATCACCGATGGCATTCACAGCCTCCATCACCAGCC
CGCTCACCACCCAGCATACCCTCTTGTTTAACATCTTGGGGGGATGGGTGGCTGCCCAACTCGCCCCCCC 
CAGCGCTGCCTCAGCTTTCGTGGGCGCCGGCATCGCTGGAGCCGCTGTTGGCACGATAGGCCTTGGGAAG 
GTGCTTGTGGACATTCTGGCAGGTTATGGAGCAGGGGTGGCGGGCGCACTTGTGGCCTTTAAGATCATGA 
GCGGCGAGATGCCTTCAGCCGAGGACATGGTCAACTTACTCCCTGCCATCCTTTCTCCCGGTGCCCTGGT 

CGTCGGGATTGTGTGTGCAGCAATACTGCGTCGGCATGTGGGCCCAGGGGAAGGGGCTGTGCAGTGGATG
AACCGGCTGATAGCGTTCGCCTCGCGGGGTAACCACGTCTCCCCCAGGCACTATGTGCCAGAGAGCGAGC 
CTGCAGCGCGTGTTACCCAGATCCTTTCCAGCCTCACCATCACTCAGCTGTTGAAGAGACTCCACCAGTG 
GATTAATGAGGACTGCTCTACGCCATGCTCCAGCTCGTGGCTAAGGGAGATTTGGGACTGGATCTGCACG 
GTGTTGACTGACTTCAAGACCTGGCTCCAGTCCAAGCTCCTGCCGCGATTACCGGGAGTCCCTTTTTTCT 

CATGCCAACGCGGGTATAAGGGAGTCTGGCGGGGGGACGGCATCATGCACACCACCTGCCCATGCGGAGC
ACAGATCACCGGACACGTCAAAAACGGTTCCATGAGGATCGTTGGGCCTAAAACCTGCAGCAACACGTGG 
TACGGGACATTCCCCATCAACGCGTACACCACGGGCCCCTGCACACCCTCCCCGGCGCCAAACTATTCCA 
AGGCATTGTGGAGAGTGGCCGCTGAGGAGTACGTGGAGGTCACGCGGGTGGGAGATTTTCACTACGTGAC 
GGGCATGACCACTGACAACGTGAAGTGTCCATGCCAGGTTCCGGCCCCCGAATTCTTCACGGAGGTGGAT 

GGAGTGCGGTTGCACAGGTACGCTCCGGCGTGCAGACCTCTCCTACGGGAGGAGGTCGTATTCCAGGTCG
GGCTCCACCAGTACCTGGTCGGGTCACAGCTCCCATGCGAGCCCGAACCGGATGTAGCAGTGCTCACTTC 
CATGCTCACTGACCCCTCCCACATTACAGCAGAGACGGCTAAGCGTAGGCTGGCCAGGGGGTCTCCCCCC 
TCCTTGGCCAGCTCTTCAGCTAGCCAGTTGTCTGCGCCTTCCTTGAAGGCGACATGCACTACCCATCATG 
ACTCCCCGGACGCTGACCTCATTGAGGCCAACCTCTTGTGGCGGCAAGAGATGGGCGGGAACATCACCCG 

CGTGGAGTCAGAGAATAAGGTGGTAATCCTGGACTCTTTCGACCCGCTCCGAGCGGAGGATGATGAGGGG
GAAATATCCGTTCCGGCGGAGATCCTGCGGAAATCCAGGAAATTCCCCCCAGCGCTGCCCATATGGGCGC 
CGCCGGATTACAACCCTCCGCTGCTAGAGTCCTGGAAGGACCCGGACTACGTTCCTCCGGTGGTACACGG 
GTGCCCGTTGCCGCCCACCAAGGCCCCTCCAATACCACCTCCACGGAGGAAGAGGACGGTTGTCCTGACA 
GAATCCACCGTGTCTTCTGCCTTGGCGGAGCTCGCTACTAAGACCTTCGGCAGCTCCGGATCGTCGGCCA 

TCGACAGCGGTACGGCGACCGCCCCTCCTGACCAAGCCTCCGGTGACGGCGACAGAGAGTCCGACGTTGA
GTCGTTCTCCTCCATGCCCCCCCTTGAGGGAGAGCCGGGGGACCCCGATCTCAGCGACGGATCTTGGTCC 
ACCGTGAGCGAGGAGGCTAGTGAGGACGTCGTCTGCTGTTCGATGTCCTACACATGGACAGGCGCCCTGA 
TCACGCCATGCGCTGCGGAGGAAAGCAAGTTGCCCATCAACCCGTTGAGCAATTCTTTGCTACGTCACCA 
CAACATGGTCTATGCTACAACATCCCGCAGCGCAGGCCTGCGGCAGAAGAAGGTCACCTTTGACAGACTG 

CAAGTCCTGGACGACCACTACCGGGACGTGCTTAAGGAGATGAAGGCGAAGGCGTCCACAGTTAAGGCTA
AACTTCTATCTGTAGAAGAAGCCTGCAAACTGACGCCCCCACATTCGGCCAAATCCAAATTTGGCTACGG 
GGCGAAGGACGTCCGGAGCCTATCCAGCAGGGCCGTTACCCACATCCGCTCCGTGTGGAAGGACCTGCTG 
GAAGACACTGAAACACCAATTAGCACTACCATCATGGCAAAAAATGAGGTTTTCTGTGTCCAACCAGAGA 
AGGGAGGCCGCAAGCCAGCTCGCCTTATCGTGTTCCCAGATCTGGGAGTTCGTGTATGCGAGAAGATGGC 

CCTTTATGACGTGGTCTCCACCCTTCCTCAGGCCGTGATGGGCTCCTCATACGGATTCCAGTACTCTCCT
AAGCAGCGGGTCGAGTTCCTGGTGAATACCTGGAAATCAAAGAAATGCCCCATGGGCTTCTCATATGACA
CCCGCTGTTTTGACTCAACGGTCACTGAGAATGACATCCGTGTTGAGGAGTCAATTTACCAATGTTGTGA 
CTTGGCCCCCGAAGCCAAACTGGCCATAAAGTCGCTCACAGAGCGGCTCTATATCGGGGGTCCCCTGACT 
AATTCAAAAGGGCAGAACTGCGGTTACCGCCGGTGCCGCGCGAGCGGCGTGCTGACGACTAGCTGCGGTA 

ATACCCTCACATGTTACCTGAAAGCCACTGCGGCCTGTCGAGCTGCGAAGCTCCGGGACTGCACGATGCT
CGTGAACGGAGACGACCTTGTCGTTATCTGTGAAAGCGCGGGAACCCAAGAGGATGCGGCGAGCCTACGA
GTCTTCACGGAGGCTATGACTAGGTACTCTGCCCCCCCTGGGGACCCGCCTCAACCGGAATACGACTTGG 
AGTTGATAACATCATGTTCCTCCAATGTGTCGGTCGCACACGATGCATCTGGTAAAAGGGTGTACTACCT 
CACCCGTGACCCTACCACCCCCCTTGCACGGGCTGCGTGGGAGACAGCTAGACACACTCCAGTCAACTCC 

TGGCTAGGCAACATCATCATGTATGCGCGCACCTTATGGGCAAGGATGATTCTGATGACTCATTTCTTCT
CCATCCTTCTAGCTCAGGAGCAACTTGAAAAAACCCTAGATTGTCAGATCTACGGGGCCTGTTACTCCAT 
TGAACCACTTGATCTACCTCAGATCATTGAGCGACTCCATGGTCTTAGCGCATTTTCACTCCATAGTTAC 
TCTCCAGGCGAGATCAATAGGGTGGCTTCATGCCTCAGAAAACTTGGGGTACCACCCTTGCGAGCCTGGA 
GACATCGGGCCAGAAGTGTCCGCGCTAAGCTACTGTCCCAGGGGGGGAGGGCCGCCACTTGTGGCAAGTA 

CCTCTTCAACTGGGCGGTGAGGACCAAGCTCAAACTCACTCCAATCCCAGCCGCGTCCCGGTTGGACTTG
TCCGGCTGGTTCGTTGCTGGTTACAGCGGGGGAGACATATATCACAGCCTGTCTCGTGCCCGACCCCGCT 
GGTTCATGTTGTGCCTACTCCTACTTTCCGTGGGGGTAGGCATCTACCTGCTCCCCAACCGATGAATGGG 
GAGCTAAACACTCCAGGCCAATAGGCCGTTTCTC (SEQ ID NO: 6689)



gi|329739|gb|L02836.1|HPCCGENOM Hepatitis C China virus complete genome 
ATTGGGGGCGACACTCCACCATAGATCACTCCCCTGTGAGGAACTACTGTCTTCACGCAGAAAGCGTCTA
GCCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCCATAGTGGTCTGC 
GGAACCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGATCAACCCGCTCAATGCCTG 
GAGATTTGGGCGTGCCCCCGCGAGACTGCTAGCCGAGTAGTGTTGGGTCGCGAAAGGCCTTGTGGTACTG 
CCTGATAGGGTGCTTGCGAGTGCCCCGGGAGGTCTCGTAGACCGTGCACCATGAGCACGAATCCTAAACC 

TCAAAGAAAAACCAAACGTAACACCAACCGCCGCCCACAGGACGTCAAGTTCCCGGGCGGTGGTCAGATC
GTTGGTGGAGTTTACCTGTTGCCGCGCAGGGGCCCCAGGTTGGGTGTGCGCGCGACTAGGAAGACTTCCG 
AGCGGTCGCAACCTCGTGGAAGGCGACAACCTATCCCCAAGGCTCGCCGACCCGAGGGCAGGACCTGGGC 
TCAGCCCGGGTATCCTTGGCCCCTCTATGGCAATGAGGGCTTTGGGTGGGCAGGATGGCTCCTGTCACCC 
CGCGGCTCCCGGCCTAGTTGGGGCCCCACGGACCCCCGGCGTAGGTCGCGTAATTTGGGTAAGGTCATCG 

ATACCCTCACATGCGGCTTCGCCGACCTCATGGGGTACATTCCGCTCGTCGGCGCCCCCTTGGGGGGCGC
TGCCAGGGCCCTGGCACATGGTGTCCGGGTTCTGGAGGACGGCGTGAACTATGCAACAGGGAATTTGCCC 
GGTTGCTCTTTCTCTATCTTCCTTTTAGCCTTGCTATCCTGTTTGACCACCCCAGCTTCCGCTTACGAAG 
TGCGTAACGTGTCCGGGATATACCATGTCACGAACGACTGCTCCAACTCAAGCATTGTGTATGAGGCAGC 
GGACCTGATCATGCATACCCCTGGGTGCGTGCCCTGCGTTCGGGAAGGCAACTCCTCCCGTTGCTGGGTA 

GCGCTCACTCCCACGCTCGCGGCCAGGAACGCCACGATCCCCACTGCGACAGTACGACGGCATGTCGATC
TGCTCGTTGGGGCGGCTGCTTTCTCTTCCGCCATGTACGTGGGGGATCTCTGCGGATCTGTTTTCCTTGT 
CTCTCAGCTGTTCACCTTCTCGCCTCGCCGGTATGAGACAATACAGGACTGCAATTGCTCAATCTATCCC 
GGCCACGTAACAGGTCACCGCATGGCTTGGGATATGATGATGAACTGGTCGCCTACAACAGCTCTAGTGG 
TGTCGCAGTTACTCCGGATCCCTCAAGCCGTCATGGACATGGTGGTGGGGGCCCACTGGGGAGTCCTGGC 

GGGCCTTGCCTACTATGCCATGGTGGGGAATTGGGCTAAGGTTTTGATTGTGATGCTACTCTTCGCCGGC
GTTGATGGGGATACCTACGCGTCTGGGGGGGCGCAGGGCCGCTCCACCCTCGGGTTCACGTCCCTCTTTA 
CACCTGGGGCCTCTCAGAAGATCCAGCTTATAAATACCAATGGTAGCTGGCATATCAACAGGACTGCCCT 
GAACTGCAATGACTCCCTCAATACTGGGTTTCTTGCCGCGCTGTTCTATACACACAGGTTCAACGCGTCC 
GGATGCGCAGAGCGCATGGCCAGCTGCCGCCCCATTGATACATTCGATCAGGGCTGGGGCCCCATCACTT 

ATACTGAGCCTGATAGCTCGGACCAGAGGCCTTATTGCTGGCACTACGCGCCTCGAAAGTGCGGCATCGT
ACCTGCGTCGGAGGTGTGCGGTCCAGTGTATTGTTTCACCCCAAGCCCTGTCGTCGTGGGGACGACCGAT 
CGTTTCGGTGTCCCCACATATAGCTGGGGGGAGAATGAGACAGACGTGCTGCTCCTCAACAACACGCGGC 
CGCCGCAAGGCAACTGGTTTGGCTGTACATGGATGAATGGCACTGGGTTCACCAAGACGTGCGGGGGGCC 
TCCGTGTAACATCGGGGGGGTCGGCAACAACACTTTGACTTGCCCCACGGATTGCTTTCGGAAGCACCCC 

GAGGCTACGTATACAAGGTGTGGTTCGGGGCCTTGGCTGACACCTAGGTGCTTAGTTGACTACCCATACA
GGCTCTGGCACTACCCCTGCACTGTCAACTTTGCCATCTTCAAAGTTAGGATGTATGTGGGGGGCGTGGA 
GCACAGGCTCGATGCTGCATGCAACTGGACTCGAGGAGAGCGCTGTAACTTGGAGGACAGGGATAGATCA 
GAACTCAGCCCGCTGCTACTGTCTACAACAGAGTGGCAGATACTACCCTGCGCCTTCACCACCCTACCGG 
CTCTGTCCACTGGTTTAATCCATCTCCATCAGAACATCGTGGACGTGCAATACCTGTACGGTATAGGGTC 

AGCGGTTGCCTCCTTTGCAATTAAATGGGAGTATGTCTTGTTGCTTTTCCTTCTACTAGCAGACGCGCGC
GTATGTGCCTGCTTGTGGATGATGCTGCTGATAGCCCAGGCCGAGGCCGCCTTAGAGAACCTGGTGGTCC 
TCAATGCGGCGTCCGTGGCCGACGCGCATGGCATCCTCTCCTTCCTTGTGTTCTTTTGTGCCGCCTGGTA 
CATTAAGGGCAGGCTGGTCCCCGGGGCAGCATACGCTTTCTACGGCGTGTGGCCGCTGCTCCTGCTCCTG 
CTGACATTACCACCACGAGCTTACGCCATGGACCGGGAGATGGCTGCATCGTGCGGAGGCGCGGTTTTTG 

TAGGTCTGGTATTCCTGACTTTGTCACCATACTACAAGGTGTTCCTCGCTAGGCTCATATGGTGGTTGCA
ATACTTCCTCACCATAGCCGAGGCGCACCTGCAAGTGTGGATCCCCCCTCTCAACATTCGAGGGGGCCGC
GATGCCATCATCCTCCTCACGTGTGCAATCCACCCAGAGTCAATCTTTGACATCACCAAACTCCTGCTCG 
CCACGCTCGGTCCGCTCCTGGTGCTTCAGGCTGGCATAACTAGAGTGCCGTACTTTGTGCGCGCTCATGG 
GCTCATTCGCGCGTGCATGCTATTGCGGAAAGTTGCTGGGGGTCATTATGTCCAAATGGCCTTCATGAAG 

CTGGGCGCACTGACAGGTACGTACGTCTATAACCATCTTACTCCGCTGCAGTATTGGCCACGCGCGGGTT
TACGAGAACTCGCGGTGGCAGTAGAGCCCGTCATCTTCTCTGACATGGAGACCAAGATTATCACCTGGGG 
GGCAGACACTGCAGCGTGTGGAGACATCATCTTGGGTTTACCCGTCTCCGCCCGAAGGGGAAAGGAGATA 
CTCCTGGGGCCGGCCGATAGTCTTGAAGGGCAGGGGTGGCGACTCCTTGCGCCCATCACGGCCTACTCCC
AACAGACGCGGGGCTTACTTGGTTGCATCATCACTAGCCTCACAGGCCGAGACAAGAACCAGGTCGAGGG 

GGAGGTTCAAGTGGTCTCCACCGCAACACAATCTTTCCTGGCGACCTGCATCAACGGTGTGTGTTGGACT
GTCTATCATGGCGCCGGCTCAAAAACCTTAGCCGGCCCAAAGGGCCCAATCACCCAAATGTACACCAATG
TAGACCAGGACCTCGTCGGCTGGCACCGGCCCCCCGGGGCGCGTTCCCTAACACCATGCACCTGCGGCAG 
CTCGGACCTTTACTTGGTCACGAGACATGCTGATGTCATTCCGGTGCGCCGTCGAGGCGACAGTAGGGGG 
AGTTTACTCTCCCCCAGGCCTGTCTCCTACCTGAAGGGCTCGTCGGGGGGCCCACTGCTCTGCCCCTTCG 
GGCACGTTGCAGGCATCTTCCGGGCTGCTGTGTGCACCCGGGGGGTTGCGAAGGCGGTGGATTTTATACC
CGTTGAGACCATGGAAACTACCATGCGGTCCCCGGTCTTCACGGACAACTCATCCCCTCCTGCCGTACCG 
CAGACATTCCAAGTGGCCCATCTACACGCTCCCACTGGCAGCGGCAAAAGCACCAAGGTGCCGGCTGCAT
ATGCAGCCCAAGGGTACAAGGTACTTGTCTTGAACCCGTCTGTTGCCGCCACTTTAGGTTTTGGGGCGTA 
TATGTCTAAGGCACATGGTGTCGACCCCAACATTAGAACCGGGGTAAGGACCATCACCACGGGCGCCCCC 

ATCACATACTCTACCTATGGCAAGTTCCTTGCTGATGGTGGTTGCTCTGGGGGTGCCTATGACATTATAA
TATGTGATGAGTGCCATTCAACTGACTCGACTACCATCTTGGGCATCGGCACGGTCCTGGACCAAGCGGA
GACGGCTGGAGCGCGGCTTGTCGTGCTCGCCACCGCTACGCCTCCGGGATCGGTCACCGTGCCACATCCA 
AACATCGAGGAGGTGGCCCTGTCCAATACTGGAGAGATCCCCTTCTATGGTAAAGCCATCCCCATCGAAG 
CCATCAGGGGGGGAAGGCATCTCATTTTCTGCCACTCCAAGAAGAAGTGTGACGAGCTTGCTGCAAAGCT 

ATCATCGCTCGGGCTCAACGCTGTGGCGTACTACCGGGGGCTTGATGTGTCCGTCATACCATCTAGCGGA
GACGTCGTTGTCGTGGCAACGGACGCTCTAATGACGGGCTTTACGGGCGACTTTGACTCAGTGATCGACT 
GTAACACATGTGTTACCCAAACAGTCGATTTCAGCTTGGACCCCACCTTCACCATCGAGACAACGACCGT 
GCCCCAAGACGCGGTGTCGCGCTCGCAGCGGCGAGGTAGGACTGGCAGGGGTAGGGAAGGCATCTACAGG 
TTTGTTACTCCAGGAGAACGGCCCTCGGGCATGTTCGACTCCTCAGTCCTGTGTGAGTGCTATGACGCGG 

GCTGTGCTTGGTACGAGCTCACGCCGGCTGAGACCACGGTTAGGTTGCGGGCTTACCTAAATACACCAGG
GTTGCCCGTCTGCCAGGACCATCTGGAGTTCTGGGAGGGCGTCTTCACAGGTCTCACCCATATAGACGCT 
CACTTTCTGTCCCAGACCAAGCAAGCAGGAGACAACTTCCCCTACCTGGTAGCATACCAAGCTACAGTGT 
GTGCCAAGGCTCAGGCCCCACCTCCATCGTGGGATCAAATGTGGAAGTGCCTCACACGGCTAAAGCCTAC 
GCTGCAGGGACCAACACCCCTGCTGTATAGGCTAGGAGCCGTCCAAAATGAGGTCACCCTCACACACCCC 

ATAACTAAATACATCATGACATGCATGTCGGCTGACCTGGAGGTCGTCACCAGCACCTGGGTGCTGGTGG
GCGGAGTCCTTGCAGCTCTGGCCGCGTATTGCCTGACAACGGGCAGCGTGGTCATTGTGGGTAGGATTGT 
CTTGTCCGGAAGTCCGGCTATTGTTCCTGACAGGGAAGTTCTTTACCAAGACTTCGACGAGATGGAAGAG 
TGTGCCTCACACCTCCCTTACATCGAACAGGGAATGCAGCTCGCCGAGCAGTTCAAGCAGAAGGCGCTCG 
GGTTGCTGCAAACAGCCACCAAGCAAGCGGAGGCTGCTGCTCCCGTGGTGGAGTCCAAGTGGCGAGCCCT 

CGAGACATTTTGGGAAAAACACATGTGGAATTTCATCAGCGGGATACAGTACTTAGCAGGCTTATCCACT
CTGCCTGGGAACCCCGCAATGGCATCACTGATGGCATTCACAGCTTCTATCACCAGCCCGCTCACTACCC 
AACACACCCTCCTGTTTAACATCTTGGGTGGATGGGTGGCTGCCCAACTCGCTCCCCCCAGCGCCGCTTC 
GGCCTTTGTGGGCGCCGGCATTGCCGGTGCGGCTGTTGGCAGCATAGGCCTTGGGAAGGTGCTTGTGGAC 
ATCCTGGCGGGTTATGGGGCGGGGGTGGCTGGCGCACTCGTGGCCTTTAAGGTCATGAGTGGCGAAATGC 

CCTCCACTGAGGACCTGGTTAATTTACTCCCTGCCATCCTCTCTCCTGGTGCCCTAGTCGTCGGGGTCGT
GTGCGCAGCAATACTGCGCCGACACGTGGGCCCGGGAGAGGGGGCTGTGCAGTGGATGAACCGGCTGATA 
GCGTTCGCTTCGCGGGGTAACCATGTCTCCCCCACGCACTATGTGCCTGAAAGTGACGCCGCAGCGCGTG 
TTACCCAGATCCTCTCCAGCCTTACCATCACTCAGCTGCTGAAAAGACTTCACCAGTGGATTAATGAGGA
CTGTTCCACACCATGCTCCGGCTCGTGGCTAAGGGATGTTTGGGATTGGATATGCACGGTGTTGACCGAT 

TTCAAGACCTGGCTCCAGTCCAAGCTCCTGCCGCGGTTGCCCGGAGTCCCTTTCCTCTCATGCCAACGCG
GGTACAAGGGAGTCTGGCGGGGGGACGGTATTATGCAAACCACCTGTCCATGTGGAGCACAGATTACTGG 
ACATGTCAAAAACGGTTCCATGAGAATCGTTGGGCCTAAGACTTGTAGCAACACGTGGCATGGAACATTC 
CCCATCAACGCGTACACCACGGGCCCCTGCACACCCTCCCCGGCGCCGAACTATTCCAGGGCGCTGTGGC 
GGGTGGCTCCTGAGGAGTACGTGGAGGTTACGCGGGTGGGGGATTTCCACTACGTGACGGGCATGACCAC 

CGACAACGTGAAATGCCCATGCCAAGTCCCGGCCCCTGAATTCTTCACGGAGGTGGATGGAGTACGGCTG
CACAGGTACGCTCCGGCGTGCflAACCTCTCCTACGGGAGGAGGTCGTGTTCCAGGTCGGGCTCAACCAAT 
ACCTGGTTGGATCACAGCTCCCATGCGAGCCCGAGCCGGACGTAACAGTGCTCACTTCCATGCTTACCGA 
CCCCTCCCACATCACAGCAGAGACGGCCAAGCGTAGGCTGGCCAGGGGGTCTCCCCCCTCCTTGGCCAGC 
TCTTCAGCTAGCCTU^TTGTCTGCGCCTTCTTTGAAGGCGACATGTACTACCCATCATGACTCCCCGGACG 

CCGACCTCATTGAGGCCAACCTCCTGTGGCGGCAGGAGATGGGCGGT^CATCACCCGTGTGGAGTCAGA
AAATAAGGTAGTGATCCTGGACTCTTTCGACCCGCTTCGGGCGGAGGAGGACGAGAGGGAAGTATCCGTT 
GCGGCGGAGATCCTGCGGAAATCCAGGAAGTTCCCCTCAGCGCTGCCCATATGGGCACGCCCAGACTACA 
ACCCTCCACTGCTAGAGTCCTGGAAGGACCCAGATTATGTCCCTCCGGTGGTACACGGGTGCCCGTTGCC 
GCCTACCACGGCCCCTCCAGTACCACCTCCACGGAGAAAAAGGACGGTCGTCCTAACAGAGTCATCCGTG 

TCTTCTGCCTTGGCGGAGCTCGCTACTAAGACCTTCGGCAGCTCTGAATCGTCGGCCGTCGACAGCGGCA
CGGCGACTGCCCCTCCTGACGAGGCCTCCGGCGGCGGCGACAAAGGATCCGACGTTGAGTCGTACTCCTC 
CATGCCCCCCCTTGAGGGAGAGCCGGGGGACCCCGACCTCAGCGACGGGTCCTGGTCTACCGTGAGTGAG 
GAGGCCAGTGAGGACGTCGTCTGCTGCTCAATGTCCTATACATGGACAGGCGCCTTGATCACGCCATGTG
CTGCGGAGGAGAGCAAGCTGCCCATCAACCCGCTGAGCAACTCCTTGCTGCGTCACCACAACATGGTCTA 
TGCTACAACATCCCGCAGTGCAAGCCTACGGCAGAAGAAGGTCGCTTTTGACAGAATGCAAGTCCTGGAC 
GACCACTACCGGGACGTGCTCAAGGAGATGAAGGCGAAGGCGTCCACAGTTAAGGCTAAACTCCTATCCA
TAGAAGAGGCCTGCAAGCTGACGCCCCCACATTCAGCCAAATCCAAATTTGGCTATGGGGCAAAAGACGT
CCGGAACCTATCCAGCAAGGCCGTTAACCACATCCGCTCCGTGTGGAAGGACTTGTTGGAAGACAATGAG 
ACACCAATCAATACCACCATCATGGCAAAAAATGAGGTTTTCTGCGTCCAACCAGAGAAAGGAGGCCGTA
AGCCAGCTCGCCTTATCGTATTCCCAGACTTGGGAGTCCGTGTGTGCGAGAAGATGGCCCTTTATGACGT 
GGTCTCCACCCTTCCTCAGCCCGTGATGGGCTCCTCATACGGATTCCAGTACTCTCCTGGGCAGCGGGTC 

GAATTCCTGCTAAATGCCTGGAAATCAAAGGAAAACCCTATGGGCTTCTCATATGACACCCGCTGTTTTG
ACTCAACGGTCACTCAGAACGACATCCGTGTTGAGGAGTCAATTTACCAATGTTGTGACTTGGCCCCCGA 
GGCCAGACGGGCCATAAAGTCGCTCACAGAGCGGCTCTATATCGGGGGTCCCCTGACTAATTCAAAAGGG
CAGAACTGCGGTTATCGCCGGTGCCGCGCAAGTGGCGTGCTGACGACCAGCTGCGGTAATACCCTTACAT
GTTACTTGAAGGCCTCTGCGGCCTGTCGAGCTGCGAAGCTGCAGGACTGCACGATGCTCGTGAACGGAGA 

CGACCTTGTCGTTATCTGTGAAAGCGCGGGAACTCAAGAGGATGCGGCGAGCCTACGAGTCTTCACGGAG
GCTATGACTAGGTACTCTGCCCCCCCTGGGGACCTGCCCCAACCAGAATACGACTTGGAGCTAATAACAT 
CATGCTCCTCCAATGTGTCAGTCGCCCACGATGCATCTGGCAAAAGGGTGTACTACCTCACCCGTGACCC 
CACCATCCCCCTCGCGCGGGCTGCGTGGGAGACAGCTAGACACACTCCAGTCAACTCCTGGCTAGGCAAC 
ATCATCATGTATGCGCCCACTCTATGGGCAAGGATGATTCTGATGACTCACTTCTTCTCCATCCTTCTAG 

CTCAGGAGCAACTTGAGAAAGCCCTGGATTGCCAAATCTACGGGGCCTACTACTCCATTGAGCCACTTGA
CCTACCTCAGATCATTGAACGACTCCATGGCCTTAGCGCATTTTCACTCCATAGTTACTCTCCAGGTGAG 
ATCAATAGGGTGGCGTCATGTCTCAGGAAACTTGGGGTACCACCCTTGCGAGTCTGGAGACATCGGGCCA 
GAAGCGTCCGCGCTAAGCTACTGTCCCAGGGGGGGAGGGCCGCCACTTGTGGCAAGTACCTCTTCAACTG 
GGCAGTAAAGACCAAGCTTAAACTCACTCCAATCCCGGCTGCGTCCCGGTTGGACTTGTCCGGCTGGTTC 

GTTGCTGGTTACAGCGGGGGAGACATATATCACAGCCTGTCTCGTGCCCGACCCCGTTGGTTCATGTTGT
GCCTACTCCTACTTTCTGTAGGGGTAGGCATCTACCTGCTCCCCAACCGATGAACGGGGAGATAAACACT 
CCAGGCCAATAGGCCATCCC (SEQ ID NO: 6690) 



gi|15422182|gb|AY051292.1| Hepatitis C virus from India polyprotein mRNA,
complete cds 

GCCAGCCCCCTGATGGGGGCGACACTCCACCATAGATCACTCCCCTGTGAGGAACTACTGTCTTCACGCA 
GT^GCGTCTAGCCATGGCGTTAGTATGAGTGTCGTGCAGCCTCCAGGACCCCCCCTCCCGGGAGAGCCA 
TAGTGGTCTGCGGAACCGGTGAGTACACCGGAATTGCCAGGACGACCGGGTCCTTTCTTGGATCAACCCG 

CTCAATGCCTGGAGATTTGGGCGTGCCCCCGCAAGACTGCTAGCCGAGTAGTGTTGGGTCGCGAAAGGCC
TTGTGGTACTGCCTGATAGGGTGCTTGCGAGTGCCCCGGGAGGTCTCGTAGACCGTGCACCATGAGCACG 
T^TCCTAAACCTCAAAGAAAAACCAAACGTAACACCAACCGACGCCCACAGAACGTTAAGTTCCCGGGTG 
GCGGCCAGATCGTTGGCGGAGTTTGCTTGTTGCCGCGCAGGGGTCCCAGAGTGGGTGTGCGCGCGACGAG 
GAAGACTTCCGAGCGGTCACAACCTCGCGGAAGGCGTCAGCCTATTCCCAAGGCCCGCCGACCCGAGGGC 

AGGTCCTGGGCGCAGCCCGGGTACCCTTGGCCCCTCTATGGCAACGAGGGCTGTGGGTGGGCAGGATGGC
TCTTGTCCCCCCGCGGCTCCCGGCCTAGTCGGGGCCCCTCTGACCCCCGGCGCAGGTCACGCAATTTGGG 
TAAGGTCATCGATACCCTCACGTGTGGCTTCGCCGACCTCATGGGGTACATCCCGCTCGTCGGTGCTCCT 
CTAGGGGGCGCTGCTAGGGCTCTGGCACATGGTGTTAGGGTTCTAGAAGACGGCGTAAATTACGCAACAG 
GGAACCTTCCTGGTTGCTCTTTTTCTATCTTCTTGCTTGCTCTTCTCTCCTGCTTGACAGTCCCTGCTTC 

GGCCGTCGAAGTGCGCAACTCTTCGGGGATCTACCATGTCACCAATGATTGCCCCAATGCGTCTGTTGTG
TACGAGACAGATAGCTTGATCATACATCTGCCCGGGTGTGTGCCCTGCGTACGCGAGGGCAACGCTTCGA 
GGTGCTGGGTCTCCCTTAGTCCTACTGTTGCCGCTAAGGATCCGGGCGTCCCCGTCAACGAGATTCGGCG 
TCACGTCGACCTGATTGTCGGGGCCGCTGCATTCTGTTCGGCTATGTATGTAGGGGACTTATGCGGTTCC 
ATCTTCCTCGTTGGCCAGCTTTTCACCCTCTCCCCTAGGCGCCACTGGACAACACAAGACTGTAATTGCT 

CCATCTACCCAGGACATGTGACAGGCCATCGT^TGGCTTGGGACATGATGATGAATTGGTCACCTACTGG
CGCTTTGGTGGTAGCGCAGCTACTCCGGATCCCACAAGCCGTCTTGGATATGATAGCCGGTGCCCACTGG 
GGTGTCCTAGCGGGCCCGGCATACTACTCCATGGTGGGGAACTGGGCTAAGGTTTTGGTTGTGCTACTGC 
TCTTCGCTGGCGTCGATGCAACCACCCAAGTCACAGGTGGCACCGCGGGCCGTAATGCATATAGATTGGC 
TAGCCTCTTCTCCACCGGCCCCAGCCAAAATATCCAGCTCATAAACTCCAATGGCAGCTGGCACATTAAC 

AGGACTGCCCTGAATTGCAATGACAGCCTGCACACCGGCTGGGTAGCAGCGCTGTTCTACTCCCACAAGT
TCAACTCTTCGGGGCGTCCTGAGAGGATGGCTAGTTGTCGGCCTCTTACCGCCTTCGACCAAGGGTGGGG
GCCCATCACTTACGGGGGGAAAGCTAGTAACGACCAGCGGCCGTATTGCTGGCACTATGCCCCACGCCCG 
TGCGGTATCGTGCCGGCGAAAGAGGTTTGCGGGCCTGTATACTGTTTCACACCCAGTCCCGTGGTAGTGG 
GGACGACGGACAAGTACGGCGTTCCTACCTACACATGGGGCGAGAATGAGACGGATGTACTGCTCCTTAA 
CAACTCTAGGCCGCCAATAGGGAATTGGTTCGGGTGTACGTGGATGAATTCCACTGGTTTCACCAAGACG
TGCGGGGCTCCTGCCTGTAACGTCGGCGGGAGCGAGACCAACACCCTGTCGTGCCCCACAGATTGCTTCC 
GCAGACATCCGGACGCAACATACGCTAAGTGCGGCTCTGGCCCTTGGCTTAACCCTCGATGCATGGTGGA 
CTACCCTTACAGGCTCTGGCACTATCCCTGCACAGTCAATTACACCATATTCAAGATCAGGATGTTCGTG 
GGCGGGATTGAGCACAGGCTCACCGCCGCGTGCAACTGGACGCGGGGAGAGCGCTGCGACTTGGACGACA 

GGGATCGTGCCGAGTTGAGCCCGCTGTTGCTGTCCACCACGCAATGGCAGGTCCTCCCCTGCTCATTCAC
AACGCTGCCCGCCCTGTCAACTGGCCTAATACATCTCCACCAGAACATCGTGGACGTGCAGTACCTCTAC 
GGGTTGAGCTCGGTAGTTACATCCTGGGCCATAAGGTGGGAGTATGTCGTGCTCCTTTTCTTGCTGTTAG 
CAGATGCCCGCATTTGTGCCTGCCTTTGGATGATGCTTCTCATATCCCAGGTAGAGGCGGCGCTGGAGAA 
CCTGATAGTCCTCAACGCTGCTTCCCTGGCTGGGACACACGGCATCGTCCCTTTCTTCATCTTTTTTTGT 

GCAGCCTGGTATCTGAAAGGCAAGTGGGCCCCTGGACTCGTCTACTCCGTCTACGGAATGTGGCCGCTGC
TCCTGCTTCTCCTGGCGTTGCCCCAACGGGCGTACGCCTTGGATCAGGAGTTGGCCGCGTCGTGTGGGGC 
CGTGGTCTTCATCAGCCTAGCGGTACTTACCCTGTCGCCGTACTACAAACAGTACATGGCCCGCGGCATC 
TGGTGGCTGCAGTACATGCTGACCAGAGCGGAGGCGCTCCTGCACGTCTGGGTCCCCTCGCTCAACGCCC 
GGGGAGGGCGTGATGGTGCCATACTGCTCATGTGTGTGCTCCACCCGCACTTGCTCTTTGACATCACCAA 

AATCATGCTGGCCATTCTCGGGCCCCTGTGGATCTTGCAGGCCAGTCTGCTCAGGGTGCCGTACTTCGTG
CGCGCCCACGGTCTCATTAGGCTCTGCATGCTGGTGCGCAAAACAGCGGGCGGTCACTATGTGCAGATGG 
CTCTGTTGAAGCTGGGGGCACTTACTGGCACTTACATTTACAACCACCTTTCCCCACTCCAAGACTGGGC 
TCATGGCAGCTTGCGTGATCTAGCGGTGGCCACCGAGCCCGTCATCTTCTCCCGGATGGAGATCAAGACT 
ATCACCTGGGGGGCAGACACCGCGGCCTGTGGAGACATCATCAACGGGCTGCCTGTTTCTGCTCGGAGGG 

GGAGAGAGGTGTTGTTGGGACCAGCCGATGCCCTGACTGACAAGGGATGGAGGCTTTTAGCCCCCATCAC
AGCTTACGCCCAACAGACACGAGGTCTCTTGGGCTGTATTGTCACCAGCCTCACCGGTCGGGACAAAAAT 
CAAGTGGAGGGGGAAATCCAGATTGTGTCTACCGCAACCCAGACGTTCTTGGCCACTTGCATCAACGGAG 
CTTGCTGGACTGTTTATCATGGGGCCGGATCGAGGACCATCGCTTCGGCGTCGGGTCCTGTGGTCCGGAT 
GTACACCAATGTGGACCAGGATTTGGTGGGCTGGCCAGCGCCTCAGGGAGCGCGCTCCCTGACGCCGTGC 

ACGTGCGGTGCCTCGGATCTGTACTTGGTCACGAGGCACGCGGATGTCATCCCAGTGCGGCGTCGAGGCG
ATAACAGGGGAAGCTTGCTTTCTCCCCGGCCCATCTCATACCTAAAAGGATCCTCGGGAGGCCCTCTGCT 
CTGCCCCATGGGACATGTCGCGGGCATTTTTAGGGCCGCGGTGTGCACCCGTGGGGTTGCTVAAGGCGGTC 
GACTTTGTGCCCGTTGAGTCCTTAGAGACCACCATGAGGTCCCCAGTGTTTACTGACAATTCCAGCCCTC 
CAACAGTGCCCCAGAGTTACCAGGTGGCACATCTACATGCACCCACTGGGAGTGGCAAGAGCACGAAGGT 

GCCGGCCGCTTACGCAGCTCAAGGGTACAAGGTACTTGTGCTGAACCCGTCTGTTGCTGCCACCTTAGGG
TTCGGTGCTTATATGTCAAAGGCCCATGGGATTGACCCAAACGTCAGGACCGGCGTGAGGACCATTACCA 
CAGGCTCCCCCATCACCTACTCCACCTACGGGAAATTTTTGGCTGATGGCGGATGCCCAGGAGGTGCGTA 
CGACATCATAATATGTGACGAATGTCACTCAGTGGACGCCACCTCGATTCTGGGCATAGGGACCGTCTTG 
GACCAAGCGGAGACGGCGGGGGTTAGGCTCACTGTCCTTGCCACCGCTACACCACCTGGCTTGGTCACCG 

TGCCACATTCCAACATCGAGGAAGTTGCACTGTCCGCTGACGGGGAGAAACCATTTTATGGTAAGGCCAT
CCCCCTAAACTACATCAAGGGGGGGAGGCATCTCATTTTCTGTCATTCCAAGAAGAAGTGCGACGAGCTC 
GCTGCAAAGCTGGTCGGTCTGGGCGTCAACGCGGTGGCCTTTTACCGTGGCCTCGACGTATCTGTCATTC 
CAACTACAGGAGACGTCGTTGTTGTAGCGACCGACGCCTTGATGACTGGCTTCACCGGCGATTTCGACTC 
TGTGATAGACTGCAACACCTGTGTCGTCCAGACAGTCGACTTCAGCCTAGACCCTATATTCTCTATTGAG 

ACTTCCACCGTGCCCCAGGACGCCGTGTCCCGCTCCCAACGGAGGGGTAGGACCGGTCGAGGGAAGCATG
GTATTTACAGATATGTGTCACCCGGGGAGCGGCCGTCTGGCATGTTCGACTCCGTGGTCCTCTGTGAGTG 
CTATGACGCGGGTTGTGCTTGGTACGAGCTTACACCCGCCGAGACCACAGTCAGGCTACGGGCATACCTT 
AACACCCCAGGATTGCCCGTGTGCCAGGACCACTTGGAGTTCTGGGAGAGTGTCTTCACCGGCCTCACCC 
ACATAGATGCCCACTTCCTGTCCCAGACGAAACAGAGTGGGGAGAACTTCCCCTACCTAGTCGCATACCA 

AGCCACCGTGTGCGCTAGAGCTAGAGCTCCTCCCCCGTCATGGGACCAAATGTGGAAGTGCCTGATACGG
CTCAAGCCCACCCTCACTGGGGCTACCCCATTACTATACAGACTGGGTAGTGTACAGAATGAGATCACCT 
TAACACACCCAATCACCCAATACATCATGGCTTGCATGTCGGCGGACCTGGAGGTCGTCACTAGCACGTG 
GGTGTTGGTGGGCGGCGTCCTAGCCGCTTTGGCCGCTTACTGCCTGTCCACAGGCAGCGTGGTCATAGTG 
GGCAGGATAATCCTAGGTGGGAAGCCGGCAGTCATACCTGACAGGGAGGTTCTCTACCGAGAGTTTGATG 

AGATGGAGGAGTGCGCCGCCCACGTCCCCTACCTCGAGCAGGGGATGCATTTGGCTGGACAGTTCAAGCA
GTUVAGCTCTCGGGTTGCTCCAGACAGCATCCAAGCAAGCGGAGACGATCACTCCCACTGTCCGCACCAAC 
TGGCAGAAACTCGAGTCCTTCTGGGCTAAGCACATGTGGAACTTCGTTAGCGGGATACAATACCTGGCGG
GCCTGTCAACGCTGCCCGGGAACCCCGCTATAGCGTCGCTGATGTCGTTTACGGCCGCGGTGACGAGTCC
ACTAACCACCCAGCAAACCCTCTTCTTTAACATCTTAGGGGGGTGGGTGGCGGCCCAGCTTGCTTCCCCA 
GCTGCCGCTACTGCTTTTGTCGGTGCTGGTATTACTGGCGCCGTTGTTGGCAGTGTGGGCCTAGGGAAGG 
TCCTAGTGGACATTATTGCTGGCTACGGGGCTGGTGTGGCGGGGGCCCTCGTGGCTTTCAAAATCATGAG 
CGGGGAGACCCCCACCACCGAGGATCTAGTCAACCTTCTGCCTGCCATCCTATCGCCAGGAGCTCTCGTT
GTCGGCGTGGTGTGCGCAGCAATACTACGCCGGCACGTGGGCCCTGGCGAGGGCGCCGTGCAGTGGATGA 
ACCGGCTGATAGCGTTTGCTTCTCGGGGTAACCACGTCTCCCCTACACACTACGTGCCGGAGAGCGACGC 
GTCGGCTCGTGTCACACAAATTCTCACCAGCCTCACTGTTACTCAGCTTCTGAAAAGGCTCCACGTGTGG 
ATAAGCTCGGATTGCATCGCCCCGTGTGCTAGTTCTTGGCTTAAAGATGTCTGGGACTGGATATGCGAGG 

TGCTGAGCGACTTC7\AGAATTGGCTGAAGGCCAAACTTGTACCACAACTGCCCGGGATCCCATTCGTATC
CTGCCAACGCGGGTACCGTGGGGTCTGGCGGGGCGAGGGCATCGTGCACACTCGTTGCCCGTGTGGGGCC 
AATATAACTGGACATGTCAAGAACGGTTCGATGAGAATCGTCGGGCCTAAGACTTGCAGCAACACCTGGC 
GTGGGTCGTTCCCCATTAACGCTTACACTACAGGCCCGTGCACGCCCTCCCCGGCGCCGAACTATACGTT 
CGCGCTATGGAGGGTGTCTGCAGAGGAGTATGTGGAGGTAAGGCGGCTGGGGGACTTCCATTACGTCACG 

GGGGTGACCACTGATAAACTCAAGTGTCCATGCCAGGTCCCCTCACCCGAGTTCTTCACAGAGGTGGACG
GGGTGCGCCTGCATAGGTACGCCCCCCCCTGCAAACCCCTGCTGCGAGAAGAGGTGACGTTTAGCATCGG 
GCTCAATGAATACTTGGTGGGGTCCCAGTTGCCCTGCGAGCCCGAGCCAGACGTAGCTGTACTGACATCA 
ATGCTTACAGACCCCTCCCACATCACTGCAGAGACGGCAGCGCGTAGGCTGAAGCGGGGGTCTCCCCCCT 
CCCTGGCCAGCTCTTCCGCCAGCCAGCTGTCCGCGCCGTCACTGAAGGCAACATGCACCACTCACCACGA 

CTCTCCAGACGCTGACCTCATAGTUIGCCT^CCTCCTGTGGAGACAGGAGATGGGGGGGAACATCACTAGG
GTGGAGTCGGAGAACAAGATTGTCGTTCTGGATTCTTTCGACCCGCTCGTAGCGGAGGAGGATGATCGGG 
AGATCTCTATTCCAGCTGAGATTCTGCGGAAGTTCAAGCAGTTTCCTCCCGCTATGCCCATATGGGCACG 
GCCAGATTATAATCCTCCCCTTGTGGAACCGTGGAAGCGCCCGGACTATGAGCCACCCTTAGTCCACGGG 
TGCCCCCTACCACCTCCCAAGCCAACTCCGGTGCCGCCACCCCGGAGAAAGAGGACGGTGGTGCTGGACG 

AGTCTACAGTATCATCTGCTCTGGCTGAGCTTGCCACTAAGACCTTCGGCAGCTCTACAACCTCAGGCGT
GACAAGTGGTGAAGCGACTGAATCGTCCCCGGCGCCCTCCTGCGGCGGTGAGCTGGACTCCGAAGCTGAA 
TCTTACTCCTCCATGCCCCCTCTCGAGGGGGAGCCGGGGGACCCCGATCTCAGCGACGGGTCTTGGTCTA 
CCGTGAGCAGTGATGGTGGCACGGAAGACGTTGTGTGCTGCTCGATGTCTTACTCGTGGACGGGCGCTTT 
AATCACGCCCTGTGCCTCAGAGGAAGCCAAGCTCCCTATCAACGCATTGAGCAACTCGCTGCTGCGCCAC 

CACAACTTGGTGTATTCCACCACCTCTCGCAGCGCTGGCCAGAGACAGAAAAAAGTCACATTTGACAGAG
TGCAAGTCCTGGACGACCATTACCGGGACGTGCTCAAGGAGGCTAAGGCCAAGGCATCCACGGTGAAGGC 
TAGACTGCTATCCGTTGAGGAAGCGTGTAGCCTGACGCCCCCACACTCCGCCAGATCAAAATTTGGCTAT 
GGGGCG7VAGGATGTCCGAAGCCATTCCAGTAAGGCTATACGCCACATCAACTCCGTGTGGCAGGACCTTC 
TGGAGGACAATACAACACCCATAGACACTACCATCATGGCAAAGAATGAGGTCTTCTGTGTGAAGCCCGA 

AAAGGGGGGCCGCAAGCCCGCTCGTCTTATCGTGTACCCCGACCTGGGAGTGCGCGTATGCGAGAAGAGG
GCTTTGTATGACGTAGTCAAACAGCTCCCCATTGCCGTGATGGGAGCCTCCTACGGGTTCCAGTACTCAC 
CAGCGCAGCGGGTCGACTTCCTGCTTAAAGCGTGGAAATCTAAGAAAGTCCCCATGGGGTTTTCCTATGA 
CACCCGTTGCTTTGACTCAACAGTCACTGAGGCTGATATCCGTACGGAGGAAGACCTCTACCAATCTTGT 
GACCTGGCCCCTGAGGCTCGCATAGCCATAAGGTCCCTCACAGAGAGGCTTTACATCGGGGGCCCACTCA 

CCAATTCTAAGGGACAAAACTGCGGCTATCGGCGATGCCGCGCAAGCGGCGTGCTGACCACTAGCTGCGG
TAACACCAT/IACCTGCTTCCTCAAAGCCAGTGCAGCCTGTCGAGCTGCGAAGCTCCAGGACTGCACCATG 
CTCGTGTGCGGCGACGACCTCGTCGTTATCTGTGAGAGCGCCGGTGTCCAGGAGGACGCTGCGAGCCTGA 
GAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCGGGAGACCCGCCTCAACCAGAATACGACTT 
GGAGCTTATAACATCCTGCTCCTCCAATGTGTCGGTCGCGCGCGACGGCGCTGGCAAAAGGGTCTATTAT 

CTGACCCGTGACCCTGAGACTCCCCTCGCGCGTGCCGCTTGGGAGACAGCAAGACACACTCCAGTGAACT
CCTGGCTAGGCAACATCATCATGTTTGCCCCCACTCTGTGGGTACGGATGGTCCTCATGACCCATTTTTT 
CTCCATACTCATAGCTCAGGAGCACCTTGGAAAGGCTCTAGATTGTGAAATCTATGGAGCCGTACACTCC 
GTCCAACCGTTGGACTTACCTGAAATCATCCAAAGACTCCACAGCCTCAGCGCGTTTTCGCTCCACAGTT 
ACTCTCCAGGTGAAATCAATAGGGTGGCTGCATGCCTCAGGAAGCTTGGGGTTCCGCCCTTGCGAGCTTG 

GAGACACCGGGCCCGGAGCGTTCGCGCCACACTCCTATCCCAGGGGGGGAAAGCCGCTATATGCGGTAAG
TACCTCTTCAACTGGGCGGTGAAT^ACCAAACTCAAACTCACTCCATTACCGTCCATGTCTCAGTTGGACT 
TGTCCAACTGGTTCACGGGCGGTTACAGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCG 
TTTGTTCCTCTGGTGCCTACTCCTACTTTCAGTAGGGGTAGGCATCTATCTCCTTCCCAACCGATAGACG 
GNTGGGCAACCACTCCGGGTCTTTAGGCCCTATTTAAACACTCCAGGCCTTTAGGCCCCGT 

(SEQ ID NO: 6691)

gi|23510419|ref|NM_000043.3| Homo sapiens tumor necrosis factor receptor

superfamily, member 6 (TNFRSF6), transcript variant 1, mRNA 

CCTACCCGCGCGCAGGCCAAGTTGCTGAATCAATGGAGCCCTCCCCAACCCGGGCGTTCCCCAGCGAGGC 

TTCCTTCCCATCCTCCTGACCACCGGGGCTTTTCGTGAGCTCGTCTCTGATCTCGCGCAAGAGTGACACA 

CAGGTGTTCAAAGACGCTTCTGGGGAGTGAGGGAAGCGGTTTACGAGTGACTTGGCTGGAGCCTCAGGGG 

CGGGCACTGGCACGGAACACACCCTGAGGCCAGCCCTGGCTGCCCAGGCGGAGCTGCCTCTTCTCCCGCG 

GGTTGGTGGACCCGCTCAGTACGGAGTTGGGGAAGCTCTTTCACTTCGGAGGATTGCTCAACAACCATGC 

TGGGCATCTGGACCCTCCTACCTCTGGTTCTTACGTCTGTTGCTAGATTATCGTCCAAAAGTGTTAATGC 

CCAAGTGACTGACATCAACTCCAAGGGATTGGAATTGAGGAAGACTGTTACTACAGTTGAGACTCAGAAC 

TTGGAAGGCCTGCATCATGATGGCCAATTCTGCCATAAGCCCTGTCCTCCAGGTGAAAGGAAAGCTAGGG 

ACTGCACAGTCAATGGGGATGAACCAGACTGCGTGCCCTGCCAAGAAGGGAAGGAGTACACAGACAAAGC 

CCATTTTTCTTCCA71ATGCAGAAGATGTAGATTGTGTGATGAAGGACATGGCTTAGAAGTGG7VAATAAAC 

TGCACCCGGACCCAGAATACCAAGTGCAGATGTAAACCAAACTTTTTTTGTAACTCTACTGTATGTGAAC 

ACTGTGACCCTTGCACCAAATGTGAACATGGAATCATCAAGGAATGCACACTCACCAGCAACACCAAGTG 

CAAAGAGGAAGGATCCAGATCTAACTTGGGGTGGCTTTGTCTTCTTCTTTTGCCAATTCCACTAATTGTT 

TGGGTGAAGAGAAAGGAAGTACAGAAAACATGCAGAAAGCACAGAAAGGAAAACCAAGGTTCTCATGAAT 

CTCCAACCTTAAATCCTGAAACAGTGGCAATAAATTTATCTGATGTTGACTTGAGT7VAATATATCACCAC 

TATTGCTGGAGTCATGACACTAAGTCAAGTTAAAGGCTTTGTTCGAAAGAATGGTGTCAATGAAGCCAAA 

ATAGATGAGATCAAGAATGACAATGTCCAAGACACAGCAGAACAGAAAGTTCAACTGCTTCGTAATTGGC 

ATCAACTTCATGGAAAGAAAGAAGCGTATGACACATTGATTAAAGATCTCT^AAAAAGCCAATCTTTGTAC 

TCTTGCAGAGAAAATTCAGACTATCATCCTCAAGGACATTACTAGTGACTCAGAAAATTCAAACTTCAGA 

AATGAAATCCAAAGCTTGGTCTAGAGTGAAAAACAACAAATTCAGTTCTGAGTATATGCAATTAGTGTTT 

GAAAAGATTCTTAATAGCTGGCTGTAAATACTGCTTGGTTTTTTACTGGGTACATTTTATCATTTATTAG 

CGCTGAAGAGCCAACATATTTGTAGATTTTTAATATCTCATGATTCTGCCTCCAAGGATGTTTAAAATCT 

AGTTGGGAAAACAAACTTCATCAAGAGTAAATGCAGTGGCATGCT7\AGTACCCAAATAGGAGTGTATGCA 

GAGGATGAAAGATTAAGATTATGCTCTGGCATCTAACATATGATTCTGTAGTATGAATGTAATCAGTGTA 

TGTTAGTACAAATGTCTATCCACAGGCTAACCCCACTCTATGAATCAATAGAAG7VAGCTATGACCTTTTG 

CTGAAATATCAGTTACTGAACAGGCAGGCCACTTTGCCTCTAAATTACCTCTGATAATTCTAGAGATTTT 

ACCATATTTCTAAACTTTGTTTATAACTCTGAGAAGATCATATTTATGTAAAGTATATGTATTTGAGTGC 

AGAATTTAAATAAGGCTCTACCTCAAAGACCTTTGCACAGTTTATTGGTGTCATATTATACAATATTTCA 

ATTGTGAATTCACATAGAAAACATTAAATTATAATGTTTGACTATTATATATGTGTATGCATTTTACTGG 

CTCAAAACTACCTACTTCTTTCTCAGGCATCAAAAGCATTTTGAGCAGGAGAGTATTACTAGAGCTTTGC 

CACCTCTCCATTTTTGCCTTGGTGCTCATCTTAATGGCCTAATGCACCCCCAAACATGGAAATATCACCA 

AAAAATACTTAATAGTCCACCAAAAGGCAAGACTGCCCTTAGAAATTCTAGCCTGGTTTGGAGATACTAA 

CTGCTCTCAGAGAAAGTAGCTTTGTGACATGTCATGAACCCATGTTTGCAATCAAAGATGATAAAATAGA 

TTCTTATTTTTCCCCCACCCCCGAAAATGTTCAATAATGTCCCATGTAAAACCTGCTACAAATGGCAGCT 

TATACATAGCAATGGTAAAATCATCATCTGGATTTAGGAATTGCTCTTGTCATACCCCCAAGTTTCTAAG 

ATTTAAGATTCTCCTTACTACTATCCTACGTTTAAATATCTTTGTW^GTTTGTATTAAATGTGAATTTTA 

AGAT^TAATATTTATATTTCTGTAAATGTAAACTGTGAAGATAGTTATAAACTGAAGCAGATACCTGGAA 

CCACCTAAAGAACTTCCATTTATGGAGGATTTTTTTGCCCCTTGTGTTTGGAATTATATVAATATAGGTAA 

AAGTACGTAATTAAATAATGTTTTTGGTAAAAAAAA7VAAAAAAAAAAAAAAAAAAA7\AAAAAAAAAA7^ 

AAAAAAAAAAAAAAAAAAAAAAAAA (SEQ ID NO: 6692) 



gi|35910|emb|X12387.1|HSRCYP3 Human mRNA for cytochrome P-450 (cyp3 locus)
GAATTCCCAAAGAGCAACACAGAGCTGAAAGGAAGACTCAGAGGAGAGAGATAAGTAAGGAAAGTAGTGA 

TGGCTCTCATCCCAGACTTGGCCATGGAAACCXGGCTTCTCCTGGCTGTCAGCCTGGTGCTCCTCTATCT 

ATATGGAACCCATTCACATGGACTTTTTAAGAAGCTTGGAATTCCAGGGCCCACACCTCTGCCTTTTTTG 

GGAAATATTTTGTCCTACCATAAGGGCTTTTGTATGTTTGACATGGAATGTCATAAAAAGTATGGAAAAG 

TGTGGGGCTTTTATGATGGTCAACAGCCTGTGCTGGCTATCACAGATCCTGACATGATCAAAACAGTGCT 

AGTGAAAGTU^TGTTATTCTGTCTTCACAAACCGGAGGCCTTTTGGTCCAGTGGGATTTATGAAAAGTGCC 

ATCTCTATAGCTGAGGATGAAGAATGGAAGAGATTACGATCATTGCTGTCTCCAACCTTCACCAGTGGAA 

AACTCAAGGAGATGGTCCCTATCATTGCCCAGTATGGAGATGTGTTGGTGAGAAATCTGAGGCGGGAAGC 

AGAGACAGGCAAGCCTGTCACCTTGAAAGACGTCTTTGGGGCCTACAGCATGGATGTGATCACTAGCACA 

TCATTTGGAGTGAACATCGACTCTCTCAACAATCCACAAGACCCCTTTGTGGAAAACACCAAGAAGCTTT
TAAGATTTGATTTTTTGGATCCATTCTTTCTCTCAATAACAGTCTTTCCATTCCTCATCCCAATTCTTGA
AGTATTAAATATCTGTGTGTTTCCAAGAGAAGTTACAAATTTTTTAAGAAAATCTGTAAAAAGGATGAAA
GA7UVGTCGCCTCGAAGATACACAAAAGCACCGAGTGGATTTCCTTCAGCTGATGATTGACTCTCAGAATT 
CA7V7\AGA7UVCTGAGTCCCACAAAGCTCTGTCCGATCTGGAGCTCGTGGCCCAATCAATTATCTTTATTTT 
TGCTGGCTATGAAACCACGAGCAGTGTTCTCTCCTTCATTATGTATGAACTGGCCACTCACCCTGATGTC
CAGCAGAAACTGCAGGAGGAAATTGATGCAGTTTTACCCAATAAGGCACCACCCACCTATGATACTGTGC 
TACAGATGGAGTATCTTGACATGGTGGTGAATGAAACGCTCAGATTATTCCCAATTGCTATGAGACTTGA 
GAGGGTCTGCAAAAAAGATGTTGAGATCAATGGGATGTTCATTCCCAAAGGGTGGGTGGTGATGATTCCA 
AGCTATGCTCTTCACCGTGACCCAAAGTACTGGACAGAGCCTGAGAAGTTCCTCCCTGAAAGATTCAGCA 

AGAAGAACAAGGACAACATAGATCCTTACATATACACACCCTTTGGTUVGTGGACCCAGAAACTGCATTGG
CATGAGGTTTGCTCTCATGAACATGAAACTTGCTCTAATCAGAGTCCTTCAGAACTTCTCCTTCAAACCT 
TGTi^AAGAAACACAGATCCCCCTGAAATTAAGCTTAGGAGGACTTCTTCAACCAGAAAAACCCGTTGTTC 
TAAAGGTTGAGTCAAGGGATGGCACCGTAAGTGGAGCCTGAATTTTCCTAAGGACTTCTGCTTTGCTCTT 
CAAGAAATCTGTGCCTGAGAACACCAGAGACCTCAAATTACTTTGTGAATAGAACTCTGAAATGAAGATG 

GGCTTCATCCAATGGACTGCATAAATAACCGGGGATTCTGTACATGCATTGAGCTCTCTCATTGTCTGTG
TAGAGTGTTATACTTGGGAATATAAAGGAGGTGACCAAATCAGTGTGAGGAGGTAGATTTGGCTCCTCTG 
CTTCTCACGGGACTATTTCCACCACCCCCAGTTAGCACCATTAACTCCTCCTGAGCTCTGATAAGAGAAT 
CAACATTTCTCAATAATTTCCTCCAC7\AATTATT7VATGAAAATAAGAATTATTTTGATGGCTCTAACAAT 
GACATTTATATCACATGTTTTCTCTGGAGTATTCTATAGTTTTATGTTAAATCAATAAAGACCACTTTAC 

AAAAGTATTATCAGATGCTTTCCTGCACATTAAGGAGAATCTATAGAACTGAATGAGAACCAACAAGTAA
ATATTTTTGGTCATTGTAATCACTGTTGGCGTGGGGCCTTTGTCAGAACTAGAATTTGATTATTAACATA 
GGTGAAAGTTAATCCACTGTGACTTTGCCCATTGTTTAGAAAGAATATTCATAGTTTAATTATGCCTTTT 
TTGATCAGGCACATGGCTCACGCCTGTAATCCTAGCAGTTTGGGAGGCTGAGCCGGGTGGATCGCCTGAG 
GTCAGGAGTTCAAGACAAGCCTGGCCTACATGGTGAAACCCCATCTCTACTAA7\AATACACAAATTAGCT 

AGGCATGGTGGACTCGCCTGTAATCTCACTACACAGGAGGCTGAGGCAGGAGAATCACTTGAACCTGGGA
GGCGGATGTTGAAGTGAGCTGAGATTGCACCACTGCACTCCAGTCTGGGTGAGAGTGAGACTCAGTCTTA 
AAAAAATATGCCTTTTTGAAGCACGTACATTTTGTAACAAAGAACTGAAGCTCTTATTATATTATTAGTT 
TTGATTTAATGTTTTCAGCCCATCTCCTTTCATATTTCTGGGAGACAGAAT^ACATGTTTCCCTACACCTC 
TTGCTTCCATCCTCAACACCCAACTGTCTCGATGCAATGAACACTTAATAAAAAACAGTCGATTGGTCAA 

AAAAAAAAAAAAAAAAAAAAAAAGAATTC (SEQ ID NO: 6693)



gi|339549|gb|M19154.1|HUMTGFB2A Human transforming growth factor-beta-2
mRNA, complete cds 

GCCCCTCCCGTCAGTTCGCCAGCTGCCAGCCCCGGGACCTTTTCATCTCTTCCCTTTTGGCCGGAGGAGC
CGAGTTCAGATCCGCCACTCCGCACCCGAGACTGACACACTGAACTCCACTTCCTCCTCTTAAATTTATT 
TCTACTTAATAGCCACTCGTCTCTTTTTTTCCCCATCTCATTGCTCCAAGAATTTTTTTCTTCTTACTCG 
CCAAAGTCAGGGTTCCCTCTGCCCGTCCCGTATTAATATTTCCACTTTTGG7U\CTACTGGCCTTTTCTTT 
TTAAAGGAATTCAAGCAGGATACGTTTTTCTGTTGGGCATTGACTAGATTGTTTGCAAAAGTTTCGCATC 

AAAAACAACAACAACAAAAAACCAAACAACTCTCCTTGATCTATACTTTGAGAATTGTTGATTTCTTTTT
TTTATTCTGACTTTTAAAAACAACTTTTTTTTCCACTTTTTTAAAAAATGCACTACTGTGTGCTGAGCGC 
TTTTCTGATCCTGCATCTGGTCACGGTCGCGCTCAGCCTGTCTACCTGCAGCACACTCGATATGGACCAG 
TTCATGCGCAAGAGGATCGAGGCGATCCGCGGGCAGATCCTGAGCAAGCTGAAGCTCACCAGTCCCCCAG 
AAGACTATCCTGAGCCCGAGGAAGTCCCCCCGGAGGTGATTTCCATCTACAACAGCACCAGGGACTTGCT 

CCAGGAGAAGGCGAGCCGGAGGGCGGCCGCCTGCGAGCGCGAGAGGAGCGACGAAGAGTACTACGCCAAG
GAGGTTTACAAAATAGACATGCCGCCCTTCTTCCCCTCCGAAACTGTCTGCCCAGTTGTTACAACACCCT 
CTGGCTCAGTGGGCAGCTTGTGCTCCAGACAGTCCCAGGTGCTCTGTGGGTACCTTGATGCCATCCCGCC 
CACTTTCTACAGACCCTACTTCAGAATTGTTCGATTTGACGTCTCAGCAATGGAGAAGAATGCTTCCAAT 
TTGGTGAAAGCAGAGTTCAGAGTCTTTCGTTTGCAGAACCCAAAAGCCAGAGTGCCTGAACAACGGATTG 

AGCTATATCAGATTCTCAAGTCCAAAGATTTAACATCTCCAACCCAGCGCTACATCGACAGCAAAGTTGT
GAAAACAAGAGCAGAAGGCGAATGGCTCTCCTTCGATGTAACTGATGCTGTTCATGAATGGCTTCACCAT 
AAAGACAGGAACCTGGGATTTAAAATAAGCTTACACTGTCCCTGCTGCACTTTTGTACCATCTAATAATT 
ACATCATCCCAAATAAAAGTGAAGAACTAGAAGCAAGATTTGCAGGTATTGATGGCACCTCCACATATAC 
CAGTGGTGATCAGAAAACTATAAAGTCCACTAGGAAAAAAAACAGTGGGAAGACCCCACATCTCCTGCTA 

ATGTTATTGCCCTCCTACAGACTTGAGTCACAACAGACCAACCGGCGGAAGAAGCGTGCTTTGGATGCGG
CCTATTGCTTTAGAAATGTGCAGGATAATTGCTGCCTACGTCCACTTTACATTGATTTCAAGAGGGATCT
AGGGTGGAAATGGATACACGAACCCAAAGGGTACAATGCCAACTTCTGTGCTGGAGCATGCCCGTATTTA 
TGGAGTTCAGACACTCAGCACAGCAGGGTCCTGAGCTTATATAATACCATAAATCCAGAAGCATCTGCTT 
CTCCTTGCTGCGTGTCCCAAGATTTAGAACCTCTAACCATTCTCTACTACATTGGC7VAAACACCCAAGAT 
TGAACAGCTTTCTAATATGATTGTAAAGTCTTGCAAATGCAGCTAAAATTCTTGGAAAAGTGGCAAGACC
AAAATGACAATGATGATGATAATGATGATGACGACGACAACGATGATGCTTGTAACAAGAAAACATAAGA 
GAGCCTTGGTTCATCAGTGTTAAAAAATTTTTGAAAAGGCGGTACTAGTTCAGACACTTTGGAAGTTTGT 
GTTCTGTTTGTTAAAACTGGCATCTGACACAAAAAAAGTTGAAGGCCTTATTCTACATTTCACCTACTTT 
GTAAGTGAGAGAGACAAGAAGCAAATTTTTTTTAAAGAAAAAAATAAACACTGGAAGAATTTATTAGTGT 

TAATTATGTGAACAACGACAACAACAACAACAACAACAAACAGGAAAATCCCATTAAGTGGAGTTGCTGT
ACGTACCGTTCCTATCCCGCGCCTCACTTGATTTTTCTGTATTGCTATGCAATAGGCACCCTTCCCATTC 
TTACTCTTAGAGTTAACAGTGAGTTATTTATTGTGTGTTACTATATAATGAACGTTTCATTGCCCTTGGA 
AAATAAAACAGGTGTATA/^AGTGGAGACCAAATACTTTGCCAGAAACTCATGGATGGCTTAAGGAACTTG 
AACTCAAACGAGCCAGAAAAAAAGAGGTCATATTAATGGGATGAAAACCCAAGTGAGTTATTATATGACC 

GAGAAAGTCTGCATTAAGATAAAGACCCTGAAAACACATGTTATGTATCAGCTGCCTAAGGAAGCTTCTT
GTAAGGTCC7^7W^CTAAAAAGACTGTTAAT7\AAAGAAACTTTCAGTCAG (SEQ ID NO: 6694) 



gi|186624|gb|J04111.1|HUMJUNA Human c-jun proto oncogene (JUN), complete
cds, clone hCJ-1

CCCGGGGAGGGGACCGGGGAACAGAGGGCCGAGAGGCGTGCGGCAGGGGGGAGGGTAGGAGAAAGAAGGG 
CCCGACTGTAGGAGGGCAGCGGAGCATTACCTCATCCCGTGAGCCTCCGCGGGCCCAGAGAAGAATCTTC 
TAGGGTGGAGTCTCCATGGTGACGGGCGGGCCCGCCCCCCTGAGAGCGACGCGAGCCAATGGGAAGGCCT 
TGGGGTGACATCATGGGCTATTTTTAGGGGTTGACTGGTAGCAGATAAGTGTTGAGCTCGGGCTGGATAA 

GGGCTCAGAGTTGCACTGAGTGTGGCTGAAGCAGCGAGGCGGGAGTGGAGGTGCGCGGAGTCAGGCAGAC
AGACAGACACAGCCAGCCAGCCAGGTCGGCAGTATAGTCCGAACTGCAAATCTTATTTTCTTTTCACCTT 
CTCTCTAACTGCCCAGAGCTAGCGCCTGTGGCTCCCGGGCTGGTGGTTCGGGAGTGTCCAGAGAGCCTTG 
TCTCCAGCCGGCCCCGGGAGGAGAGCCCTGCTGCCCAGGCGCTGTTGACAGCGGCGGAAAGCAGCGGTAC 
CCCACGCGCCCGCCGGGGGACGTCGGCGAGCGGCTGCAGCAGCAAAGAACTTTCCCGGCGGGGAGGACCG 

GAGACAAGTGGCAGAGTCCCGGAGCGAACTTTTGCAAGCCTTTCCTGCGTCTTAGGCTTCTCCACGGCGG
TAAAGACCAGAAGGCGGCGGAGAGCCACGCAAGAGAAGAAGGACGTGCGCTCAGCTTCGCTCGCACCGGT 
TGTTGAACTTGGGCGAGCGCGAGCCGCGGCTGCCGGGCGCCCCCTCCCCCTAGCAGCGGAGGAGGGGACA 
AGTCGTCGGAGTCCGGGCGGCCAAGACCCGCCGCCGGCCGGCCACTGCAGGGTCCGCACTGATCCGCTCC 
GCGGGGAGAGCCGCTGCTCTGGGAAGTGAGTTCGCCTGCGGACTCCGAGGAACCGCTGCGCCCGAAGAGC 

GCTCAGTGAGTGACCGCGACTTTTCAAAGCCGGGTAGCGCGCGCGAGTCGACAAGTAAGAGTGCGGGAGG
CATCTTAATTAACCCTGCGCTCCCTGGAGCGAGCTGGTGAGGAGGGCGCAGCGGGGACGACAGCCAGCGG 
GTGCGTGCGCTCTTAGAGAAACTTTCCCTGTCAAAGGCTCCGGGGGGCGCGGGTGTCCCCCGCTTGCCAG 
AGCCCTGTTGCGGCCCCGAAACTTGTGCGCGCACGCCAAACTAACCTCACGTGAAGTGACGGACTGTTCT 
ATGACTGCAAAGATGGAAACGACCTTCTATGACGATGCCCTCAACGCCTCGTTCCTCCCGTCCGAGAGCG 

GACCTTATGGCTACAGTAACCCCAAGATCCTGAAACAGAGCATGACCCTGAACCTGGCCGACCCAGTGGG
GAGCCTGAAGCCGCACCTCCGCGCCAAGAACTCGGACCTCCTCACCTCGCCCGACGTGGGGCTGCTCAAG 
CTGGCGTCGCCCGAGCTGGAGCGCCTGATAATCCAGTCCAGCAACGGGCACATCACCACCACGCCGACCC 
CCACCCAGTTCCTGTGCCCCAAGAACGTGACAGATGAGCAGGAGGGGTTCGCCGAGGGCTTCGTGCGCGC 
CCTGGCCGAACTGCACAGCCAGAACACGCTGCCCAGCGTCACGTCGGCGGCGCAGCCGGTCAACGGGGCA 

GGCATGGTGGCTCCCGCGGTAGCCTCGGTGGCAGGGGGCAGCGGCAGCGGCGGCTTCAGCGCCAGCCTGC
ACAGCGAGCCGCCGGTCTACGCAAACCTCAGCMCTTCAACCCAGGCGCGCTGAGCAGCGGCGGCGGGGC 
GCCCTCCTACGGCGCGGCCGGCCTGGCCTTTCCCGCGCAACCCCAGCAGCAGCAGCAGCCGCCGCACCAC 
CTGCCCCAGCAGATGCCCGTGCAGCACCCGCGGCTGCAGGCCCTGAAGGAGGAGCCTCAGACAGTGCCCG 
AGATGCCCGGCGAGACACCGCCCCTGTCCCCCATCGACATGGAGTCCCAGGAGCGGATCAAGGCGGAGAG 

GAAGCGCATGAGGAACCGCATCGCTGCCTCCAAGTGCCGAAAAAGGAAGCTGGAGAGAATCGCCCGGCTG
GAGGAAAAAGTGAAAACCTTGAAAGCTCAGAACTCGGAGCTGGCGTCCACGGCCAACATGCTCAGGGAAC 
AGGTGGCACAGCTTAAACAGAAAGTCATGAACCACGTTAACAGTGGGTGCCAACTCATGCTAACGCAGCA 
GTTGCAAACATTTTGAAGAGAGACCGTCGGGGGCTGAGGGGCAACGAAGAAAAAAAATAACACAGAGAGA 
CAGACTTGAGAACTTGACAAGTTGCGACGGAGAGA7y\AAAGAAGTGTCCGAGAACTAAAGCCAAGGGTAT 

CCAAGTTGGACTGGGTTCGGTCTGACGGCGCCCCCAGTGTGCACGAGTGGGAAGGACTTGGTCGCGCCCT
CCCTTGGCGTGGAGCCAGGGAGCGGCCGCCTGCGGGCTGCCCCGCTTTGCGGACGGGCTGTCCCCGCGCG
AACGGAACGTTGGACTTTCGTTAACATTGACCAAGAACTGCATGGACCTAACATTCGATCTCATTCAGTA 
TTAAAGGGGGGAGGGGGAGGGGGTTACAAACTGCAATAGAGACTGTAGATTGCTTCTGTAGTACTCCTTA 
AGAACACAAAGCGGGGGGAGGGTTGGGGAGGGGCGGCAGGAGGGAGGTTTGTGAGAGCGAGGCTGAGCCT 
ACAGATGAACTCTTTCTGGCCTGCTTTCGTTAACTGTGTATGTACATATATATATTTTTTAATTTGATTA
AAGCTGATTACTGTCAATAAACAGCTTCATGCCTTTGTAAGTTATTTCTTGTTTGTTTGTTTGGGTATCC 
TGCCCAGTGTTGTTTGTAAATAAGAGATTTGGAGCACTCTGAGTTTACCATTTGTAATAAAGTATATAAT 
TTTTTTATGTTTTGTTTCTGAAAATTCCAGAAAGGATATTTAAGAAAATACAATAAACTATTGGAAAGTA 
CTCCCCTAACCTCTTTTCTGCATCATCTGTAGATCCTAGTCTATCTAGGTGGAGTTGAAAGAGTTAAGAA 

TGCTCGATAAAATCACTCTCAGTGCTTCTTACTATTAAGCAGTAAAAACTGTTCTCTATTAGACTTAGAA
ATAAATGTACCTGATGTACCTGATGCTATGTCAGGCTTCATACTCCACGCTCCCCCAGCGTATCTATATG 
GAATTGCTTACCAAAGGCTAGTGCGATGTTTCAGGAGGCTGGAGGAAGGGGGGTTGCAGTGGAGAGGGAC 
AGCCCACTGAGAAGTCAAACATTTCAAAGTTTGGATTGCATCAAGTGGCATGTGCTGTGACCATTTATAA 
TGTTAGAAATTTTACAATAGGTGCTTATTCTCAAAGCAGGAATTGGTGGCAGATTTTACAAAAGATGTAT 

CCTTCCAATTTGGAATCTTCTCTTTGACAATTCCTAGATAAAAAGATGGCCTTTGTCTTATGAATATTTA
TAACAGCATTCTGTCACAATAAATGTATTCAAATACCAATAACAGATCTTGAATTGCTTCCCTTTACTAC 
TTTTTTGTTCCCAAGTTATATACTGAAGTTTTTATTTTTAGTTGCTGAGGTT (SEQ ID NO: 6695) 



gi|179982|gb|M57729.1|HUMCCC5 Human complement component C5 mRNA, complete
cds 

CTACCTCC?U\CCATGGGCCTTTTGGGAATACTTTGTTTTTTAATCTTCCTGGGGAAAACCTGGGGACAGG 
AGCAAACATATGTCATTTCAGCACCAAAAATATTCCGTGTTGGAGCATCTGAAAATATTGTGATTCAAGT 
TTATGGATACACTGAAGCATTTGATGCAACAATCTCTATTAAAAGTTATCCTGATAAAAAATTTAGTTAC 

TCCTCAGGCCATGTTCATTTATCCTCAGAGAATAAATTCCAAAACTCTGCAATCTTAACAATACAACCAA
AACAATTGCCTGGAGGACAAAACCCAGTTTCTTATGTGTATTTGGAAGTTGTATCAAAGCATTTTTCAAA 
ATCAAAAAGAATGCCAATAACCTATGACAATGGATTTCTCTTCATTCATACAGACAAACCTGTTTATACT 
CCAGACCAGTCAGTAAAAGTTAGAGTTTATTCGTTGAATGACGACTTGAAGCCAGCCAAAAGAGAAACTG 
TCTTAACCTTCATAGATCCTGAAGGATCAGAAGTTGACATGGTAGAAGAAATTGATCATATTGGAATTAT 

CTCTTTTCCTGACTTCAAGATTCCGTCTAATCCTAGATATGGTATGTGGACGATCAAGGCTAAATAT7U\A
GAGGACTTTTCAACAACTGGAACCGCATATTTTGAAGTTAAAGAATATGTCTTGCCACATTTTTCTGTCT 
CAATCGAGCCAGAATATAATTTCATTGGTTACAAGTIACTTTAAGAATTTTGAAATTACTATAAAAGCAAG 
ATATTTTTATAATAAAGTAGTCACTGAGGCTGACGTTTATATCACATTTGGAATAAGAGAAGACTTAAAA 
GATGATCAAAAAGAAATGATGCAAACAGCAATGCAA/U^CACAATGTTGATAAATGGAATTGCTCAAGTCA 

CATTTGATTCTGAAACAGCAGTCAAAGAACTGTCATACTACAGTTTAGT^GATTTAAACAACAAGTACCT
TTATATTGCTGTAACAGTCATAGAGTCTACAGGTGGATTTTCTGAAGAGGCAGAAATACCTGGCATCAAA 
TATGTCCTCTCTCCCTAC7U\ACTGAATTTGGTTGCTACTCCTCTTTTCCTG7VAGCCTGGGATTCCATATC 
CCATCAAGGTGCAGGTTAAAGATTCGCTTGACCAGTTGGTAGGAGGAGTCCCAGTAATACTGAATGCACA 
AACAATTGATGTAAACCAAGAGACATCTGACTTGGATCCAAGCAAAAGTGTAACACGTGTTGATGATGGA 

GTAGCTTCCTTTGTGCTTAATCTCCCATCTGGAGTGACGGTGCTGGAGTTTAATGTCAAAACTGATGCTC
CAGATCTTCCAGAAGAAAATCAGGCCAGGGAAGGTTACCGAGCAATAGCATACTCATCTCTCAGCCAAAG 
TTACCTTTATATTGATTGGACTGAT7ACCATZVAGGCTTTGCTAGTGGGAGAACATCTGAATATTATTGTT 
ACCCCCAAAAGCCCATATATTGACAAAATAACTCACTATAATTACTTGATTTTATCCAAGGGCAAAATTA 
TCCATTTTGGCACGAGGGAGAAATTTTCAGATGCATCTTATCAAAGTATAAACATTCCAGTAACACAGAA 

CATGGTTCCTTCATCCCGACTTCTGGTCTATTATATCGTCACAGGAGAACAGACAGCAGAATTAGTGTCT
GATTCAGTCTGGTTAAATATTGAAGAAAAATGTGGCAACCAGCTCCAGGTTCATCTGTCTCCTGATGCAG 
ATGCATATTCTCCAGGCCAAACTGTGTCTCTTAATATGGCAACTGGAATGGATTCCTGGGTGGCATTAGC 
AGCAGTGGACAGTGCTGTGTATGGAGTCCAAAGAGGAGCCAAAAAGCCCTTGGAAAGAGTATTTCAATTC 
TTAGAGAAGAGTGATCTGGGCTGTGGGGCAGGTGGTGGCCTCAACAATGCCAATGTGTTCCACCTAGCTG 

GACTTACCTTCCTCACTAATGCAAATGCAGATGACTCCCAAGAAT^TGATGAACCTTGTAMGAAATTCT
CAGGCCAAGAAGAACGCTGCAAT^GAAGATAGAAGAAATAGCTGCTAAATATAAACATTCAGTAGTGAAG 
AAATGTTGTTACGATGGAGCCTGCGTTAATAATGATGAAACCTGTGAGCAGCGAGCTGCACGGATTAGTT 
TAGGGCCAAGATGCATCAAAGCTTTCACTGAATGTTGTGTCGTCGCAAGCCAGCTCCGTGCTAATATCTC 
TCATAAAGACATGCAATTGGGAAGGCTACACATGAAGACCCTGTTACCAGTAAGCAAGCCAGAAATTCGG 

AGTTATTTTCCAGAAAGCTGGTTGTGGGAAGTTCATCTTGTTCCCAG7\AGAAAACAGTTGCAGTTTGCCC
TACCTGATTCTCTAACCACCTGGGAAATTCAAGGCATTGGCATTTCAAACACTGGTATATGTGTTGCTGA
TACTGTCAAGGCAAAGGTGTTCAAAGATGTCTTCCTGGAAATGT^TATACCATATTCTGTTGTACGAGGA 
GAACAGATCC/y^TTGAAAGGAACTGTTTACAACTATAGGACTTCTGGGATGCAGTTCTGTGTTAAAATGT 
CTGCTGTGGAGGGAATCTGCACTTCGGAAAGCCCAGTCATTGATCATCAGGGCACAAAGTCCTCCAAATG 
TGTGCGCCAGAAAGTAGAGGGCTCCTCCAGTCACTTGGTGACATTCACTGTGCTTCCTCTGGAAATTGGC
CTTCACAACATCAATTTTTCACTGGAGACTTGGTTTGGAAAAGAAATCTTAGTAAAAACATTACGAGTGG 
TGCCAGAAGGTGTCAAAAGGGAAAGCTATTCTGGTGTTACTTTGGATCCTAGGGGTATTTATGGTACCAT 
TAGCAGACGAAAGGAGTTCCCATACAGGATACCCTTAGATTTGGTCCCCAAAACAGAAATCAAAAGGATT 
TTGAGTGTAAAAGGACTGCTTGTAGGTGAGATCTTGTCTGCAGTTCTAAGTCAGGAAGGCATCAATATCC 

TAACCCACCTCCCCAAAGGGAGTGCAGAGGCGGAGCTGATGAGCGTTGTCCCAGTATTCTATGTTTTTCA
CTACCTGGAAACAGGAAATCATTGGAACATTTTTCATTCTGACCCATTAATTGAAAAGCAGAAACTGAAG 
AAAAAATTAAAAGAAGGGATGTTGAGCATTATGTCCTACAGAAATGCTGACTACTCTTACAGTGTGTGGA 
AGGGTGGAAGTGCTAGCACTTGGTTAACAGCTTTTGCTTTAAGAGTACTTGGACAAGTAAATAAATACGT 
AGAGCAGAACCAAAATTCAATTTGTAATTCTTTATTGTGGCTAGTTGAGAATTATCAATTAGATAATGGA 

TCTTTCAAGGAAAATTCACAGTATCAACCAATAAAATTACAGGGTACCTTGCCTGTTGAAGCCCGAGAGA
ACAGCTTATATCTTACAGCCTTTACTGTGATTGGAATTAGAAAGGCTTTCGATATATGCCCCCTGGTGAA 
AATCGACACAGCTCTAATTAT^GCTGACAACTTTCTGCTTGAAAATACACTGCCAGCCCAGAGCACCTTT 
ACATTGGCCATTTCTGCGTATGCTCTTTCCCTGGGAGATAAAACTCACCCACAGTTTCGTTCAATTGTTT 
CAGCTTTGAAGAGAGAAGCTTTGGTTAAAGGTAATCCACCCATTTATCGTTTTTGGAAAGACAATCTTCA 

GCATAAAGACAGCTCTGTACCTAACACTGGTACGGCACGTATGGTAGAAACAACTGCCTATGCTTTACTC
ACCAGTCTGAACTTGAAAGATATAAATTATGTTAACCCAGTCATCAAATGGCTATCAGAAGAGCAGAGGT 
ATGGAGGTGGCTTTTATTCAACCCAGGACACCATCAATGCCATTGAGGGCCTGACGGAATATTCACTCCT 
GGTTAAACAACTCCGCTTGAGTATGGACATCGATGTTTCTTACAAGCATAAAGGTGCCTTACATAATTAT 
AAAATGACAGACAAGAATTTCCTTGGGAGGCCAGTAGAGGTGCTTCTCAATGATGACCTCATTGTCAGTA 

CAGGATTTGGCAGTGGCTTGGCTACAGTACATGTAACAACTGTAGTTCACAAAACCAGTACCTCTGAGGA
AGTTTGCAGCTTTTATTTGAAAATCGATACTCAGGATATTGAAGCATCCCACTACAGAGGCTACGGAAAC 
TCTGATTACA7VACGCATAGTAGCATGTGCCAGCTACAAGCCCAGCAGGGAAGAATCATCATCTGGATCCT 
CTCATGCGGTGATGGACATCTCCTTGCCTACTGGAATCAGTGCAAATGAAGAAGACTTAAAAGCCCTTGT 
GGAAGGGGTGGATCT^CTATTCACTGATTACCAAATCAAAGATGGACATGTTATTCTGCAACTGAATTGG 

ATTCCCTCCAGTGATTTCCTTTGTGTACGATTCCGGATATTTGAACTCTTTGAAGTTGGGTTTCTCAGTC
CTGCCACTTTCACAGTTTACGT^ATACCACAGACCAGATAAACAGTGTACCATGTTTTATAGCACTTCCAA 
TATCAAAATTCAGAAAGTCTGTGAAGGAGCCGCGTGCAAGTGTGTAGAAGCTGATTGTGGGCAAATGCAG 
GAAGAATTGGATCTGACAATCTCTGCAGAGACAAGAAAACAAACAGCATGTAAACCAGAGATTGCATATG 
CTTATAAAGTTAGCATCACATCCATCACTGTAGAAAATGTTTTTGTCAAGTACAAGGCAACCCTTCTGGA 

TATCTACAAAACTGGGGAAGCTGTTGCTGAGAAAGACTCTGAGATTACCTTCATTAAAAAGGTAACCTGT
ACTAACGCTGAGCTGGTAAAAGGAAGACAGTACTTAATTATGGGTAAAGAAGCCCTCCAGATAAAATACA 
ATTTCAGTTTCAGGTACATCTACCCTTTAGATTCCTTGACCTGGATTGAATACTGGCCTAGAGACACAAC 
ATGTTCATCGTGTCAAGCATTTTTAGCTAATTTAGATGAATTTGCCGAAGATATCTTTTTAAATGGATGC 
TAAAATTCCTGAAGTTCAGCTGCATACAGTTTGCACTTATGGACTCCTGTTGTTGAAGTTCGTTTTTTTG 

TTTTCTTCTTTTTTTAAACATTCATAGCTGGTCTTATTTGTAAAGCTCACTTTACTTAGAATTAGTGGCA
CTTGCTTTTATTAGAGAATGATTTCAAATGCTGTAACTTTCTGATIATAACATGGCCTTGGAGGGCATGAA 
GACAGATACTCCTCCAAGGTTATTGGACACCGGAAACAATA7VATTGGAACACCTCCTCAAACCTACCACT 
CAGGAATGTTTGCTGGGGCCGAAAGAACAGTCCATTGAAAGGGAGTATTACAAAAACATGGCCTTTGCTT 
GAAAGAAAATACCAAGGAACAGGAAACTGATCATTAAAGCCTGAGTTTGCTTTC (SEQ ID NO: 6696) 




gi|189944|gb|L05144.1|HUMPHOCAR Homo sapiens (clone lamda-hPEC-3)
phosphoenolpyruvate carboxykinase (PCK1) mRNA, complete cds
TGGGAACACAAACTTGCTGGCGGGAAGAGCCCGGAAAGAAACCTGTGGATCTCCCTTCGAGATCATCCAA 

AGAGAAGAAAGGTGACCTCACATTCGTGCCCCTTAGCAGCACTCTGCAG7VAATGCCTCCTCAGCTGCAAA
ACGGCCTGAACCTCTCGGCCAAAGTTGTCCAGGGAAGCCTGGACAGCCTGCCCCAGGCAGTGAGGGAGTT 
TCTCGAGAATAACGCTGAGCTGTGTCAGCCTGATCACATCCACATCTGTGACGGCTCTGAGGAGGAGAAT 
GGGCGGCTTCTGGGCCAGATGGAGGAAGAGGGCATCCTCAGGCGGCTGAAGAAGTATGACAACTGCTGGT 
TGGCTCTCACTGACCCCAGGGATGTGGCCAGGATCGAAAGCAAGACGGTTATCGTCACCCAAGAGCAAAG 

AGACACAGTGCCCATCCCCAAAACAGGCCTCAGCCAGCTCGGTCGCTGGATGTCAGAGGAGGATTTTGAG
AAAGCGTTCAATGCCAGGTTCCCAGGGTGCATGAAAGGTCGCACCATGTACGTCATCCCATTCAGCATGG
GGCCGCTGGGCTCACCTCTGTCGAAGATCGGCATCGAGCTGACGGATTCGCCCTACGTGGTGGCCAGCAT 
GCGGATCATGACGCGGATGGGCACGCCCGTCCTGGAAGCACTGGGCGATGGGGAGTTTGTCAAATGCCTC 
CATTCTGTGGGGTGCCCTCTGCCTTTACAAAAGCCTTTGGTCAACAACTGGCCCTGCAACCCGGAGCTGA 
CGCTCATCGCCCACCTGCCTGACCGCAGAGAGATCATCTCCTTTGGCAGTGGGTACGGCGGGAACTCGCT
GCTCGGGAAGAAGTGCTTTGCTCTCAGGATGGCCAGCCGGCTGGCAGAGGAGGAAGGGTGGCTGGCAGAG 
CACATGCTGATTCTGGGTATAACCAACCCTGAGGGTGAGAAGAAGTACCTGGCGGCCGCATTTCCCAGCG 
CCTGCGGGAAGACCAACCTGGCCATGATGAACCCCAGCCTCCCCGGGTGGAAGGTTGAGTGCGTCGGGGA 
TGACATTGCCTGGATGAAGTTTGACGCACAAGGTCATTTAAGGGCCATCAACCCAGAAAATGGCTTTTTC 

GGTGTCGCTCCTGGGACTTCAGTGAAGACCAACCCCAATGCCATCAAGACCATCCAG7^G7\ACACAATCT
TTACCAATGTGGCCGAGACCAGCGACGGGGGCGTTTACTGGGAAGGCATTGATGAGCCGCTAGCTTCAGG 
CGTCACCATCACGTCCTGGAAGAATAAGGAGTGGAGCTCAGAGGATGGGGAACCTTGTGCCCACCCCAAC 
TCGAGGTTCTGCACCCCTGCCAGCCAGTGCCCCATCATTGATGCTGCCTGGGAGTCTCCGGAAGGTGTTC 
CCATTGAAGGCATTATCTTTGGAGGCCGTAGACCTGCTGGTGTCCCTCTAGTCTATGAAGCTCTCAGCTG 

GCAACATGGAGTCTTTGTGGGGGCGGCCATGAGATCAGAGGCCACAGCGGCTGCAGAACATAAAGGCAAA
ATCATCATGCATGACCCCTTTGCCATGCGGCCCTTCTTTGGCTACAACTTCGGCAAATACCTGGCCCACT 
GGCTTAGCATGGCCCAGCACCCAGCAGCCAAACTGCCCAAGATCTTCCATGTCAACTGGTTCCGGAAGGA 
CAAGGAAGGCAAATTCCTCTGGCCAGGCTTTGGAGAGAACTCCAGGGTGCTGGAGTGGATGTTCT^CCGG 
ATCGATGGAAAAGCCAGCACCAACGTCACGCCCATAGGCTACATCCCCAAGGAGGATGCCCTGAACCTGA 

AAGGCCTGGGGCACATCAACATGATGGAGCTTTTCAGCATCTCCAAGGAATTCTGGGACAAGGAGGTGGA
AGACATCGAGAAGTATCTGGTGGATCAAGTCAATGCCGACCTCCCCTGTGAAATCGAGAGAGAGATCCTT 
GCCTTGAAGCAAAGTU^TAAGCCAGATGTAATCAGGGCCTGAGAATAAGCCAGATGTAATCAGGGCCTGAG 
TGCTTTACCTTTAAAATCATTAAATTAAAATCCATAAGGTGCAGTAGGAGCAAGAGAGGGCAAGTGTTCC 
CAAATTGACGCCACCTAATAATCATCACCACACCGGGAGCAGATCTGAAGGCACACTTTGATTTTTTTAA 

GGATAAGAACCACAGAACACTGGGTAGTAGCTAATGAAATTGAGAAGGGAAATCTTAGCATGCCTCCAAA
AATTCACATCCAATGCATACTTTGTTCAAATTTAAGGTTACTCAGGCATTGATCTTTTCAGTGTTTTTTC 
ACTTAGCTATGTGGATTAGCTAGAATGCACACCAAAAAGATACTTGAGCTGTATATATATATGTGTGTGT 
GTGTGTGTGTGTGTGTGTGTGTGCATGTATGTGCACATGTGTCTGTGTGATATTTGGTATGTGTATTTGT 
ATGTACTGTTATTCAAAATATATTTAATACCTTTGGAAAATCTTGGGCAAGATGACCTACTAGTTTTCCT 

TGAAAAAAAGTTGCTTTGTTATTAATATTGTGCTTAAATTATTTTTATACACCATTGTTCCTTACCTTTA
CATAATTGCAATATTTCCCCCTTACTACTTCTTGGAAAAAAATTAGAAAATGAAGTTTATAGAAAAG 
(SEQ ID NO: 6697) 



gi|6679892|ref|NM_008061.1| Mus musculus glucose-6-phosphatase, catalytic
(G6pc), mRNA

AGCAGAGGGATCGGGGCCAACCGGGCTTGGACTCACTGCACGGGCTCTGCTGGCAGCTTCCTGAGGTACC 
AAGGGAGGAAGGATGGAGGAAGGAATGAACATTCTCCATGACTTTGGGATCCAGTCGACTCGCTATCTCC 
AAGTGAATTACCAAGACTCCCAGGACTGGTTCATCCTTGTGTCTGTGATTGCTGACCTGAGGAACGCCTT 

CTATGTCCTCTTTCCCATCTGGTTCCATCTTAAAGAGACTGTGGGCATCAATCTCCTCTGGGTGGCAGTG
GTCGGAGACTGGTTC7U\CCTCGTCTTC7^GTGGATTCTGTTTGGACAACGCCCGTATTGGTGGGTCCTGG 
ACACCGACTACTACAGCAACAGCTCCGTGCCTATAATAAAGCAGTTCCCTGTCACCTGTGAGACCGGACC 
AGGAAGTCCCTCTGGCCATGCCATGGGCGCAGCAGGTGTATACTATGTTATGGTCACTTCTACTCTTGCT 
ATCTTTCGAGGAAAGAAAAAGCCAACGTATGGATTCCGGTGTTTGAACGTCATCTTGTGGTTGGGATTCT 

GGGCTGTGCAGCTGAACGTCTGTCTGTCCCGGATCTACCTTGCTGCTCACTTTCCCCACCAGGTCGTGGC
TGGAGTCTTGTCAGGCATTGCTGTGGCTGAAACTTTCAGCCACATCCGGGGCATCTACAATGCCAGCCTC 
CGGAAGTATTGTCTCATCACCATCTTCTTGTTTGGTTTCGCGCTTGGATTCTACCTGCTACTAAAAGGGC 
TAGGGGTGGACCTCCTGTGGACTTTGGAGAAAGCCAAGAGATGGTGTGAGCGGCCAGAATGGGTCCACCT 
TGACACTACACCCTTTGCCAGCCTCTTCAAAAACCTGGGAACCCTCTTGGGGTTGGGGCTGGCCCTC7\AC 

TCCAGCATGTACCGGAAGAGCTGCAAGGGAGAACTCAGCAAGTCGTTCCCATTCCGCTTCGCCTGCATTG
TGGCTTCCTTGGTCCTCCTGCATCTCTTTGACTCTCTGAAGCCCCCATCCCAGGTTGAGTTGATCTTCTA 
CATCTTGTCTTTCTGCAAGAGCGCAACAGTTCCCTTTGCATCTGTCAGTCTTATCCCATACTGCCTAGCC 
CGGATCCTGGGACAGACACACAAGAAGTCTTTGTAAGGCATGCAGAGTCTTTGGTATTTAAAGTCAACCG 
CCATGCAAAGGACTAGGAACAACTAAAGCCTCTGAAACCCATTGTGAGGCCAGAGGTGTTGACATCGGCC 

CTGGTAGCCCTGTCTTTCTTTGCTATCTTAACCAAAAGGTGAATTTTTACAAAGCTTACAGGGCTGTTTG
AGGAAAGTGTGAATGCTGGAAACTGAGTCATTCTGGATGGTTCCCTGAAGATTCGCTTACCAGCCTCCTG
TCAGATACAGAAGAGCAAGCCCAGGCTAGAGATCCCAACTGAGAATGCTCTTGCGGTGCAGAATCTTCCG 
GCTGGGAAAAGGT^AAAGAGCACCATGCATTTGCCAGGAAGAGAAAGAAGGATCGGGAGGAGGGAGAGTGT 
TTTATGTATCGAGCAT^CCAGATGCAATCTATGTCTAACCGGCTTCAGTTGTGTCTGCGTCTTTAGATAC 
GACACACTCAATAATAATAATAGACCAACTAGTGTAATGAGTAGCCAGTTAAAGGCGATTAATTCTGCTT
CCAGATAGTCTCCACTGTACATAAAAGTCACACTGTGTGCTTGCATTCCTGTATGGTAGTGGTGACTGTC 
TCTCACACCACCTTCTCTATCACGTCACAGTTTTCTCCTCCTCAGCCTATGTCTGCATTCCCCAGAATTC 
TCCACTTGTTCCCTGGCCCTGCTGCTGGACCCTGCT6TGTCTGGTAGGCAACTGTTTGTTGGTGCTTTTG 

tagggttaagttaaactctgagatcttgggc;vaaatggcaaggagacccaggattcttctctcca;\aggt 
cactccgatgttatttttgattcctggggcagaaatatgactcctttccctagcccaagccagccaagag
ctctcattcttagaagatiaaggcagccccttggtgcctgtcctcctgcctcggctgatttgcagagtact 
tcttcaaaaagaaaaaaatggtaaagctatttattaaaaattctttgttttttgctacaaatgatgcata 
tattttcacccacaccaagcactttgtttctaatatctttgataagaaaactacatgtgcagtattttat 
taaagcaacattttattta (SEQ ID NO: 6698)




gi|7110682|ref|NM_011044.1| Mus musculus phosphoenolpyruvate carboxykinase
1, cytosolic (Pck1), mRNA

ACAGTTGGCCTTCCCTCTGGGAACACACCCTCGGTCAACAGGGGAAATCCGGCAAGGCGCTCAGCGATCT 

CTGATCCAGACCTTCCAAAAGGAAGAAAGGTGGCACCAGAGTTCCTGCCTCTCTCCACACCATTGCAATT
ATGCCTCCTCAGCTGCATAACGGTCTGGACTTCTCTGCCAAGGTTATCCAGGGCAGCCTCGACAGCCTGC 
CCCAGGCAGTGAGGAAGTTCGTGGAAGGCAATGCTCAGCTGTGCCAGCCGGAGTATATCCACATCTGCGA 
TGGCTCCGAGGAGGAGTACGGGCAGTTGCTGGCCCACATGCAGGAGGAGGGTGTCATCCGCAAGCTGAAG 
AAATATGACAACTGTTGGCTGGCTCTCACTGACCCTCGAGATGTGGCCAGGATCGAAAGCAAGACAGTCA 

TCATCACCCAAGAGCAGAGAGACACAGTGCCCATCCCCAAAACTGGCCTCAGCCAGCTGGGCCGCTGGAT
GTCGGAAGAGGACTTTGAGAAAGCATTCAACGCCAGGTTCCCAGGGTGCATGAAAGGCCGCACCATGTAT 
GTCATCCCATTCAGCATGGGGCCACTGGGCTCGCCGCTGGCC7VAGATTGGTATTGAACTGACAGACTCGC 
CCTATGTGGTGGCCAGCATGCGGATCATGACTCGGATGGGCATATCTGTGCTGGAGGCCCTGGGAGATGG 
GGAGTTCATCAAGTGCCTGCACTCTGTGGGGTGCCCTCTCCCCTTAAAAAAGCCTTTGGTCAACAACTGG 

GCCTGCAACCCTGAGCTGACCCTGATCGCCCACCTCCCGGACCGCAGAGAGATCATCTCCTTTGGAAGCG
GATATGGTGGGAACTCACTACTCGGGAAGAAATGCTTTGCGTTGCGGATCGCCAGCCGTCTGGCTAAGGA 
GGAAGGGTGGCTGGCGGAGCATATGCTGATCCTGGGCATAACTAACCCCGAAGGCAAGAAGAT^TACCTG 
GCCGCAGCCTTCCCTAGTGCCTGTGGGAAGACTAACTTGGCCATGATGAACCCCAGCCTGCCCGGGTGGA 
AGGTCGAATGTGTGGGCGATGACATTGCCTGGATGAAGTTTGATGCCCAAGGCAACTTAAGGGCTATCAA 

CCCAGAAAACGGGTTTTTTGGAGTTGCTCCTGGCACCTCAGTGAAGACAAATCCAAATGCCATTAAAACC
ATCCAGAAAAACACCATCTTCACCAACGTGGCCGAGACTAGCGATGGGGGTGTTTACTGGGAAGGCATCG 
ATGAGCCGCTGGCCCCGGGAGTCACCATCACCTCCTGGAAGAACAAGGAGTGGAGACCGCAGGACGCGGA
ACCATGTGCCCATCCCAACTCGAGATTCTGCACCCCTGCCAGCCAGTGCCCCATTATTGACCCTGCCTGG 
GAATCTCCAGAAGGAGTACCCATTGAGGGTATCATCTTTGGTGGCCGTAGACCTGAAGGTGTCCCCCTTG 

TCTATGAAGCCCTCAGCTGGCAGCATGGGGTGTTTGTAGGAGCAGCCATGAGATCTGAGGCCACAGCTGC
TGCAGAACACAAGGGCAAGATCATCATGCACGACCCCTTTGCCATGCGACCCTTCTTCGGCTACAACTTC 
GGCAAATACCTGGCCCACTGGCTGAGCATGGCCCACCGCCCAGCAGCCAAGTTGCCCAAGATCTTCCATG 
TCAACTGGTTCCGGAAGGACAAAGATGGCAAGTTCCTCTGGCCAGGCTTTGGCGAGAACTCCCGGGTGCT 
GGAGTGGATGTTCGGGCGGATTGAAGGGGAAGACAGCGCCAAGCTCACGCCCATCGGCTACATCCCTAAG 

GT^AAACGCCTTGAACCTGAAAGGCCTGGGGGGCGTCAACGTGGAGGAGCTGTTTGGGATCTCTAAGGAGT
TCTGGGAGAAGGAGGTGGAGGAGATCGACAGGTATCTGGAGGACCAGGTCAACACCGACCTCCCTTACGA 
TVATTGAGAGGGAGCTCCGAGCCCTGAAACAGAGAATCAGCCAGATGTAAATCCCAATGGGGGCGTCTCGA 
GAGTCACCCCTTCCCACTCACAGCATCGCTGAGATCTAGGAGAAAGCCAGCCTGCTCCAGCTTTGAGATA 
GCGGCACAATCGTGAGTAGATCAGAAAAGCACCTTTTAATAGTCAGTTGAGTAGCACAGAGAACAGGCTA 

GGGGCAAATAAGATTGGGAGGGGAAATCACCGCATAGTCTCTGAAGTTTGCATTTGACACCAATGGGGGT
TTTGGTTCCACTTCAAGGTCACTCAGGAATCCAGTTCTTCACGTTAGCTGTAGCAGTTAGCTAAAATGCA 
CAGAAAACATACTTGAGCTGTATATATGTGTGTGAACGTGTCTCTGTGTGAGCATGTGTGTGTGTGTGTG 
TGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTACATGCCTGTCTGTCCCATTGTCCACAGTATATTTAA 
AACCTTTGGGGAAZ^TCTTGGGCAAATTTGTAGCTGTAACTAGAGAGTCATGTTGCTTTGTTGCTAGTA 

TGTATGTTTAAATTATTTTTATACACCGCCCTTACCTTTCTTTACATAATTGAAATTGGTATCCGGACCA
CTTCTTGGGAAAAAAATTACAAAATAAA (SEQ ID NO: 6699)
Example 6. siRNAs decrease mRNA levels in vivo
Male CMV-Luc mice (8-10 weeks old) from Xenogen (Cranbury, NJ) were 
administered cholesterol conjugated siRNA (see Table 16). 



Table 16. Solutions administered to mice

Group   n   Injection Mix
1       7   Buffer (PBS [pH 7.4])
2       8   Cholesterol conjugated siRNA (ALN-3001)



Table 17. Test iRNA agents targeting Luciferase

siRNA      Sequence
ALN-1070   5'-GAA CUG UGU GUG AGA GGU CCU-3' (SEQ ID NO: 6700)
           3'-CG CUU GAG ACA CAC UCU CCA GGA-5' (SEQ ID NO: 6701)
ALN-1000   5'-GAA CUG UGU GUG AGA GGU CCU-GS-3' (SEQ ID NO: 6702)
           3'-CG CUU GAC ACA CAC UCU CCA GGA-5' (SEQ ID NO: 6703)
ALN-3000   5'-GAA CUG UGU GUG AGA GGU CCU-3' (SEQ ID NO: 6704)
           3'-Cs¹Gs¹ CUU GAC ACA CAC UCU CCA GGA-5' (SEQ ID NO: 6705)
ALN-3001   5'-GAA CUG UGU GUG AGA GGU CCU-chol.²-3' (SEQ ID NO: 6706)
           3'-Cs¹Gs¹ CUU GAC ACA CAC UCU CCA GGA-5' (SEQ ID NO: 6707)

¹ 2'-O-Me group is attached to the nucleotide and the nucleotides have phosphorothioate linkages
(indicated by "s")

² cholesterol is conjugated to the antisense strand via the linker: U-pyrroline carrier-C(O)-(CH2)3-
NHC(O)-cholesterol (via cholesterol C-3 hydroxyl).
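
The strand pairing implied by the notation of Table 17 can be checked mechanically. The following is a minimal sketch, not part of the original disclosure: the function names are hypothetical, only Watson-Crick pairs are counted, and the two-nucleotide 3' overhang on the lower (3'-to-5') strand is an assumption read off the layout of the table. It uses the base sequences listed for ALN-3000 with the modification marks omitted.

```python
# Watson-Crick RNA base pairs (G-U wobble pairs are deliberately not counted).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def clean(seq: str) -> str:
    """Drop the triplet spacing used in the table and upper-case the bases."""
    return seq.replace(" ", "").upper()


def duplex_report(top_5to3: str, bottom_3to5: str, bottom_overhang: int = 2) -> dict:
    """Pair the top strand against the bottom strand position by position.

    bottom_overhang is the assumed number of unpaired nucleotides at the
    3' end of the bottom strand (written first in the 3'->5' notation).
    Mismatch positions are 0-based indices along the top strand.
    """
    top = clean(top_5to3)
    bottom = clean(bottom_3to5)[bottom_overhang:]
    paired = min(len(top), len(bottom))
    mismatches = [i for i in range(paired) if (top[i], bottom[i]) not in WATSON_CRICK]
    return {"paired_length": paired, "mismatches": mismatches}


# Base sequences listed for ALN-3000 in Table 17, modification marks omitted.
report = duplex_report(
    "GAA CUG UGU GUG AGA GGU CCU",
    "CG CUU GAC ACA CAC UCU CCA GGA",
)
print(report)  # {'paired_length': 21, 'mismatches': []}
```

Under these assumptions the 21-nucleotide upper strand pairs along its full length with the lower strand, leaving the two leading nucleotides of the lower strand as a 3' overhang.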

Animals were injected (tail vein) with 200-250 µl of test solution (buffer or an siRNA
solution). Group 1 received buffer and group 2 received cholesterol conjugated siRNA
(ALN-3001) at a dose of 50 mg/kg body weight. Twenty-two hours after injection, animals
were sacrificed and livers collected. Organs were snap frozen on dry ice, then pulverized
in a mortar and pestle.
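
The dosing arithmetic can be illustrated with a short sketch. The 50 mg/kg dose and the 200-250 µl injection volume are taken from the text above; the 25 g body weight is a hypothetical value chosen only for illustration, and the function names are likewise assumptions rather than part of the protocol.

```python
def dose_per_animal_mg(dose_mg_per_kg: float, body_weight_g: float) -> float:
    """Mass of siRNA per animal, in mg, for a weight-based dose."""
    return dose_mg_per_kg * body_weight_g / 1000.0


def required_concentration_mg_per_ml(dose_mg: float, injection_volume_ul: float) -> float:
    """Concentration needed to deliver dose_mg in the given injection volume."""
    return dose_mg / (injection_volume_ul / 1000.0)


body_weight_g = 25.0  # hypothetical mouse weight, not stated in the source
dose_mg = dose_per_animal_mg(50.0, body_weight_g)

# Injection volumes at the two ends of the stated 200-250 microliter range.
for volume_ul in (200.0, 250.0):
    conc = required_concentration_mg_per_ml(dose_mg, volume_ul)
    print(f"{volume_ul:.0f} ul -> {conc:.2f} mg/ml for a {dose_mg:.2f} mg dose")
```

For the assumed 25 g animal this corresponds to 1.25 mg of siRNA per injection, at a concentration between 5 and 6.25 mg/ml depending on the volume used.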

For Luciferase mRNA analysis (by the QuantiGene Assay (Genospectra, Inc.;
Fremont, CA)), approximately 10 mg of tissue powder was resuspended in tissue lysis buffer,
and processed according to the manufacturer's protocol. Samples of the lysate were
hybridized with probes specific for Luciferase or GAPDH (designed using ProbeDesigner
software (Genospectra, Inc., Fremont, CA)) in triplicate, and processed for luminometric
analysis. Values for Luciferase were normalized to GAPDH. Mean values were plotted with
error bars corresponding to the standard deviation of the Luciferase measurements.
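
The normalization described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis actually performed: all readings are hypothetical numbers invented for the example, the function names are assumptions, and the standard deviation is taken over the GAPDH-normalized replicate values, which is one possible reading of the text.

```python
from statistics import mean, stdev


def normalize(luciferase: list[float], gapdh: list[float]) -> list[float]:
    """Return the Luciferase/GAPDH ratio for each replicate."""
    if len(luciferase) != len(gapdh):
        raise ValueError("replicate counts must match")
    return [luc / ref for luc, ref in zip(luciferase, gapdh)]


def percent_reduction(control_mean: float, treated_mean: float) -> float:
    """Reduction of the treated group relative to the control group, in percent."""
    return 100.0 * (1.0 - treated_mean / control_mean)


# Hypothetical triplicate luminometer readings (arbitrary units), not source data.
buffer_ratios = normalize([1000.0, 1100.0, 900.0], [500.0, 500.0, 500.0])
sirna_ratios = normalize([300.0, 330.0, 270.0], [500.0, 500.0, 500.0])

buffer_mean, buffer_sd = mean(buffer_ratios), stdev(buffer_ratios)
sirna_mean, sirna_sd = mean(sirna_ratios), stdev(sirna_ratios)
reduction = percent_reduction(buffer_mean, sirna_mean)

print(f"buffer: {buffer_mean:.2f} +/- {buffer_sd:.2f}")
print(f"siRNA:  {sirna_mean:.2f} +/- {sirna_sd:.2f}")
print(f"reduction: {reduction:.0f}%")
```

The invented readings were chosen so that the computed reduction is about 70%; they only demonstrate the calculation and do not reproduce any measured values.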
Results indicated that the level of luciferase RNA in animals injected with cholesterol
conjugated siRNA was reduced by about 70% as compared to animals injected with buffer
(see FIGs. 6A and 6B).

In Vitro Activity

HeLa cells expressing luciferase were transfected with each of the siRNAs listed in
Table 17. ALN-1000 siRNAs were most effective at decreasing luciferase mRNA levels
(~0.6 nM siRNA decreased mRNA levels to about 65% of the original expression level, and
1.0 nM siRNA decreased levels to about 20% of the original expression level); ALN-3001
siRNAs were least effective (~0.6 nM siRNA had a negligible effect on mRNA levels, and
1.0 nM siRNA decreased levels to about '^0% of the original expression level).

Pharmacokinetics/Biodistribution 

Pharmacokinetic analyses were performed in mice and rats. Test siRNA molecules
were radioactively labeled with ³²P on the antisense strand by splint ligation. Labeled
siRNAs (50 mg/kg) were administered by tail vein injection, and plasma levels of siRNA
were measured periodically over 24 hrs by scintillation counting. Cholesterol conjugated
siRNA (ALN-3001) was discovered to circulate in mouse plasma for a longer period of time
than unconjugated siRNA (ALN-3000) (FIG. 7). RNAse protection assays indicated that
cholesterol-conjugated siRNA (ALN-3001) was detectable in mouse plasma 12 hours after
injection, whereas unconjugated siRNA (ALN-3000) was not detectable in mouse plasma
within two hours following injection. Similar results were observed in rats.

Mouse liver was harvested at varying time points (ranging from 0.08-24 hours)
following injection with siRNA, and siRNA localized to the liver was quantified. Over the
time period tested, the amount of cholesterol-conjugated siRNA (ALN-3001) detected in the
liver ranged from 14.3-3.55 percent of the total dose administered to the mouse. The amount
of unconjugated siRNA (ALN-3000) detected in the liver was lower, ranging from 3.91-1.75
percent of the total dose administered.
Detection of siRNA in Different Tissues

Various tissues and organs (fat, heart, kidney, liver, and spleen) were harvested from
two CMV-Luc mice 22 hours following injection with 50 mg/kg ALN-3001. The antisense
strand of the siRNA was detected by RNAse protection assay. The liver contained the
greatest concentration of siRNA (~8-10 µg siRNA/g tissue); the spleen, heart and kidney
contained lesser amounts of siRNA (~2-7 µg siRNA/g tissue); and fat tissue contained the
least amount of siRNA (<~1 µg siRNA/g tissue).

Glucose-6-phosphatase siRNA detection by RNAse Protection Assay
BALB/c mice were injected with U/U, 3'C/U, or 3'C/3'C siRNA (4 mg/kg) targeting
glucose-6-phosphatase (G6Pase) (see Table 18). Administration was by hydrodynamic tail
vein injection (hd) or non-hydrodynamic tail vein injection (iv), and siRNA was
subsequently detected in the liver by RNAse protection assay.



Table 18. Test iRNA agents targeting glucose-6-phosphatase

siRNA     Description
U/U       No cholesterol; dinucleotide 3' overhangs on sense and antisense strands
3'C/U     dinucleotide 3' overhangs on sense and antisense strands; cholesterol
          conjugated to 3' end of sense strand (mono-conjugate)
3'C/3'C   dinucleotide 3' overhangs on sense and antisense strands; cholesterol
          conjugated to 3' end of both sense and antisense strands (bis-conjugate)



Unconjugated siRNA (U/U) delivered by hd was detected by 15 min. post-injection
(the earliest determined time-point) and was still detectable in the liver 18 hours
post-injection.

Delivery by normal iv administration resulted in the greatest concentration of 3'C/3'C
siRNA (the bis-cholesterol-conjugate) in the liver 1 hour post injection (as compared to the
mono-cholesterol-conjugate 3'C/U siRNA). At 18 hours post injection, 3'C/3'C siRNA
and 3'C/U siRNA were still detectable in the liver, with the bis-conjugate at higher levels
compared to the mono-conjugate.

While this invention has been particularly shown and described with reference to
preferred embodiments thereof, it will be understood by those skilled in the art that various
changes in form and details may be made therein without departing from the scope of the
invention encompassed by the appended claims.
WHAT IS CLAIMED IS:

1. An iRNA agent comprising a sense sequence and an antisense sequence, wherein
the sense sequence has one or more asymmetrical 2'-O-alkyl modifications and the antisense
sequence has one or more asymmetrical phosphorothioate modifications, and the antisense
sequence targets a human gene sequence.

2. The iRNA agent of claim 1, wherein at least one of said 2'-O-alkyl modifications
is a 2'-OMe modification.

3. The iRNA agent of claim 1, wherein the sense sequence has at least 2
asymmetrical 2'-O-alkyl modifications.

4. The iRNA agent of claim 1, wherein the sense sequence has at least 4 asymmetrical
2'-O-alkyl modifications.

5. The iRNA agent of claim 4, wherein the asymmetrical modifications are 2'-OMe
modifications.

6. The iRNA agent of claim 1, wherein the sense sequence has at least 6
asymmetrical 2'-O-alkyl modifications.

7. The iRNA agent of claim 6, wherein the asymmetrical modifications are 2'-OMe
modifications.

8. The iRNA agent of claim 1, wherein the sense sequence has at least 8
asymmetrical 2'-O-alkyl modifications.

9. The iRNA agent of claim 8, wherein the asymmetrical modifications are 2'-OMe
modifications.
10. The iRNA agent of claim 1, wherein all of the subunits of the sense sequence
have an asymmetrical 2'-O-alkyl modification.

11. The iRNA agent of claim 10, wherein the asymmetrical modifications are 2'-OMe
modifications.

12. The iRNA agent of claim 1, wherein the antisense sequence has at least 2
asymmetrical phosphorothioate modifications.

13. The iRNA agent of claim 1, wherein the antisense sequence has at least 4
asymmetrical phosphorothioate modifications.

14. The iRNA agent of claim 1, wherein the antisense sequence has at least 6
asymmetrical phosphorothioate modifications.

15. The iRNA agent of claim 1, wherein the antisense sequence has at least 8
asymmetrical phosphorothioate modifications.

16. The iRNA agent of claim 1, wherein all of the subunits of the sense sequence
have an asymmetrical phosphorothioate modification.

17. The iRNA agent of claim 1, wherein the sense and antisense sequences are on
different RNA strands.

18. The iRNA agent of claim 1, wherein the sense and antisense sequences are on the
same RNA strand.

19. The iRNA agent of claim 1, wherein the sense and antisense sequences are fully
complementary to each other.
20. The iRNA agent of claim 1, further comprising a cholesterol moiety.

21. The iRNA agent of claim 20, wherein said cholesterol moiety is coupled to a
sense strand.

22. The iRNA agent of claim 20, further comprising a second cholesterol moiety.

23. The iRNA agent of claim 22, wherein said second cholesterol moiety is coupled
to a sense strand.

24. The iRNA agent of claim 1, wherein said human gene is an oncogene.

25. The iRNA agent of claim 1, wherein said human gene is the apoB-100 gene.

26. The iRNA agent of claim 1, wherein said human gene is the glucose-6-phosphatase
gene.

27. The iRNA agent of claim 1, wherein said human gene is the beta catenin
gene.

28. The iRNA agent of claim 1, wherein the iRNA agent is at least 21 nucleotides in
length, and the duplex region of the iRNA is about 19 nucleotides in length.

29. The iRNA agent of claim 1, having a duplex region of about 19 subunits in
length and one or two 3' overhangs of about 2 subunits in length.

30. A pharmaceutical preparation comprising the iRNA agent of claim 1.

31. A method for reducing apoB-100 levels in a subject comprising administering to
a subject an iRNA agent comprising a sense strand sequence and an antisense sequence,
wherein the sense sequence has at least 4 asymmetrical 2'-O-alkyl modifications and the
antisense sequence has at least 4 asymmetrical phosphorothioate modifications, and the
antisense sequence targets apoB-100.

32. The method of claim 31, wherein the subject is suffering from a disorder
characterized by elevated or otherwise unwanted expression of apoB-100, elevated or
otherwise unwanted levels of cholesterol, and/or disregulation of lipid metabolism.

33. The method of claim 32, wherein said disorder is chosen from the group of
HDL/LDL cholesterol imbalance; dyslipidemias; hypercholesterolemia; statin-resistant
hypercholesterolemia; coronary artery disease (CAD); coronary heart disease (CHD); and
atherosclerosis.

34. A method for reducing glucose-6-phosphatase levels in a subject comprising
administering to a subject an iRNA agent comprising a sense strand sequence and an
antisense sequence, wherein the sense sequence has at least 4 asymmetrical 2'-O-alkyl
modifications and the antisense sequence has at least 4 asymmetrical phosphorothioate
modifications, and the antisense sequence targets glucose-6-phosphatase.

35. The method of claim 34, wherein the iRNA agent is administered to a subject to
inhibit hepatic glucose production, or for the treatment of a glucose-metabolism-related
disorder.

36. The method of claim 35, wherein said disorder is diabetes.

37. The method of claim 35, wherein said disorder is type-2 diabetes.

38. A method of making an iRNA agent, the method comprising:
providing a sense strand sequence having at least 4 asymmetrical 2'-O-alkyl
modifications and an antisense sequence having at least 4 asymmetrical phosphorothioate
modifications, and allowing the sense and antisense strand to hybridize.
39. A method of stabilizing an iRNA agent, comprising selecting a sequence with
activity, and introducing one or more asymmetrical modifications in said sequence, wherein
said modification decreases nuclease sensitivity while not decreasing activity.